# MAP-to-FASTA Converter

## Description

This Python script, built on the Streamlit framework, converts biological sequence data from a specific CSV format (MAP/Tagged) to the widely used FASTA format. It handles sequences that carry annotations inside the sequence string, marked by curly braces `{}`: it strips those annotations, extracts the bare sequence, and lets the user export the cleaned sequences as either CSV or FASTA.
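As a rough illustration of the cleaning step, the sketch below strips `{}` annotations with a regular expression. It assumes a tag is a brace pair together with everything inside it and that tags are not nested; the names `TAG_PATTERN` and `strip_tags` are hypothetical and may not match the script's own code.

```python
import re

# Assumption: a tag is "{...}" including its content, with no nested braces.
TAG_PATTERN = re.compile(r"\{[^{}]*\}")


def strip_tags(tagged: str) -> str:
    """Return the sequence with every {...} annotation removed."""
    return TAG_PATTERN.sub("", tagged)


print(strip_tags("MKT{phospho}AYIAK{label}QR"))  # MKTAYIAKQR
```

A sequence without tags passes through unchanged, so the same function can be applied to every cell of the sequence column.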

## Key Features

* **Input:**
    * Accepts CSV files as input.
    * Handles CSV files with or without a header row.
    * Detects the column containing the sequence data with `{}` tags.
* **Processing:**
    * Removes `{}`-style annotations from the sequence data.
    * Extracts cleaned sequences.
* **Output:**
    * Exports cleaned data in CSV or FASTA format.
    * FASTA output includes user-defined organism and function metadata.
* **User Interface:**
    * Provides a simple and intuitive web interface using Streamlit.
    * Allows users to upload CSV files, enter metadata, select output format, and download the converted data.
* **Error Handling:**
    * Provides informative error messages for common issues, such as missing annotation tags.

## Requirements

To run this script, you need the following Python libraries:

* [Streamlit](https://streamlit.io/): For creating the web interface.
* [Pandas](https://pandas.pydata.org/): For reading and processing CSV files.
* [re](https://docs.python.org/3/library/re.html): Python's regular expression module. It is part of the standard library and needs no separate installation.

You can install the third-party libraries using pip:

```bash
pip install streamlit pandas