Current version: 1.1.1, Jul 18, 2023
Bioconvert¶
Bioconvert is a collaborative project to facilitate the interconversion of life science data from one format to another.

- contributions:
Want to add a converter? Please join https://github.com/bioconvert/bioconvert/issues/1
Overview¶
The life sciences use many different data formats. Some are old or have complex syntax, which can make converting them a challenge. Bioconvert aims to provide a common tool and interface for converting life science data from one format to another.
Many conversion tools already exist, but they may be dispersed, focused on a few specific formats, difficult to install, or not optimised. With Bioconvert, we plan to cover a wide spectrum of format conversions; we will re-use existing tools when possible and provide facilities to compare different conversion tools or methods via benchmarking. New implementations are provided when considered better than existing ones.
As of January 2023, 50 formats and 100 direct conversions were available.

Installation¶
Bioconvert is developed in Python. Please use conda or any other Python environment manager, then install Bioconvert with the pip command:
pip install bioconvert
50% of the conversions should work out of the box. However, many conversions require external tools, which is why we recommend using a conda environment. In particular, most external tools are available on the bioconda channel. For instance, to convert a SAM file to a BAM file, you would need to install samtools as follows:
conda install -c bioconda samtools
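Before running a conversion that depends on an external tool, it can help to check that the tool is actually on the PATH. The following is a minimal sketch and not part of Bioconvert: the helper name `missing_tools` is hypothetical, and it relies only on `shutil.which` from the Python standard library.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # samtools is the external tool mentioned above for SAM to BAM conversion.
    absent = missing_tools(["samtools"])
    if absent:
        print("Missing external tools:", ", ".join(absent))
        print("Try: conda install -c bioconda " + " ".join(absent))
    else:
        print("All external tools found.")
```

The check only tells you that an executable with that name exists; it says nothing about its version.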
Since Bioconvert is available on bioconda, one solution that installs Bioconvert and all its dependencies is to use conda/mamba:
conda create --name bioconvert mamba
conda activate bioconvert
mamba install bioconvert
bioconvert --help
See the Installation section for more details and alternative solutions (Docker, Singularity).
Quick Start¶
There are many conversions available. Type:
bioconvert --help
to list the valid conversions. For example, to convert a FastQ file into a FastA file, you could run any of the following:
bioconvert fastq2fasta input.fastq output.fasta
bioconvert fastq2fasta input.fq output.fasta
bioconvert fastq2fasta input.fq.gz output.fasta.gz
bioconvert fastq2fasta input.fq.gz output.fasta.bz2
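To make concrete what such a conversion involves, here is a minimal pure-Python sketch. It is an illustration only, not Bioconvert's implementation: the function names `open_text` and `fastq_to_fasta` are hypothetical, and the code assumes strict four-line FastQ records (no multi-line sequences). Compression is chosen from the file extension, mirroring the `.gz` and `.bz2` examples above.

```python
import bz2
import gzip


def open_text(path, mode):
    """Open a plain, gzip (.gz) or bzip2 (.bz2) file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    if path.endswith(".bz2"):
        return bz2.open(path, mode + "t")
    return open(path, mode)


def fastq_to_fasta(infile, outfile):
    """Write the header and sequence of each 4-line FastQ record as FastA.

    Returns the number of records written.
    """
    count = 0
    with open_text(infile, "r") as fin, open_text(outfile, "w") as fout:
        while True:
            header = fin.readline().rstrip("\n")
            if not header:
                break  # end of file (or a blank line)
            sequence = fin.readline().rstrip("\n")
            fin.readline()  # '+' separator line, ignored
            fin.readline()  # quality line, dropped in FastA
            # Replace the leading '@' of the FastQ header with '>'.
            fout.write(">" + header[1:] + "\n" + sequence + "\n")
            count += 1
    return count
```

For example, `fastq_to_fasta("input.fq.gz", "output.fasta.bz2")` would read a gzip-compressed input and write a bzip2-compressed output.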
When there is no ambiguity, you can be implicit:
bioconvert input.fastq output.fasta
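One way to picture the implicit mode is as a lookup from file extensions to a converter name. The sketch below is a guess at the idea, not Bioconvert's actual logic: `format_of`, `guess_converter`, and the `ALIASES` table are hypothetical, and the only alias included (`fq` for `fastq`) is the one that appears in the commands above.

```python
import os

# Compression suffixes seen in the examples above.
COMPRESSION_SUFFIXES = (".gz", ".bz2")
# Hypothetical alias table; only the fq/fastq pair comes from the examples.
ALIASES = {"fq": "fastq"}


def format_of(path):
    """Return the lowercase format name implied by a file name."""
    name = os.path.basename(path).lower()
    # Drop one trailing compression suffix, if present.
    for suffix in COMPRESSION_SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    ext = os.path.splitext(name)[1].lstrip(".")
    return ALIASES.get(ext, ext)


def guess_converter(infile, outfile):
    """Build a converter name such as 'fastq2fasta' from two file names."""
    src, dst = format_of(infile), format_of(outfile)
    if not src or not dst:
        raise ValueError("cannot infer formats from file extensions")
    return src + "2" + dst
```

Under these assumptions, `guess_converter("input.fq.gz", "output.fasta.bz2")` yields `"fastq2fasta"`. A file name without an extension raises `ValueError`, which corresponds to the ambiguous case where the converter has to be named explicitly.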
The default conversion method is used, but you may choose another one. Check out the available methods with:
bioconvert fastq2fasta --show-methods
For more help about a conversion, just type:
bioconvert fastq2fasta --help
and more generally:
bioconvert --help
You may also call BioConvert from a Python shell:
# Import a converter
from bioconvert.fastq2fasta import FASTQ2FASTA
# Input and output file names (example values)
infile = "input.fastq"
outfile = "output.fasta"
# Instantiate with infile/outfile names
convert = FASTQ2FASTA(infile, outfile)
# The conversion itself:
convert()
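The same instantiate-then-call pattern extends naturally to many files. The following is a minimal sketch under stated assumptions: `convert_directory` is a hypothetical helper, not a Bioconvert function, and it takes the converter class as an argument so that it only relies on the two-step usage shown above.

```python
from pathlib import Path


def convert_directory(converter_cls, indir, outdir, in_ext=".fastq", out_ext=".fasta"):
    """Convert every file ending in `in_ext` in `indir`, writing into `outdir`.

    Returns the list of output paths, in sorted order of the input names.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for infile in sorted(Path(indir).glob("*" + in_ext)):
        outfile = outdir / (infile.stem + out_ext)
        # Same two steps as above: instantiate, then call.
        convert = converter_cls(str(infile), str(outfile))
        convert()
        written.append(outfile)
    return written


# Usage, assuming Bioconvert is installed and a "reads" directory exists:
# from bioconvert.fastq2fasta import FASTQ2FASTA
# convert_directory(FASTQ2FASTA, "reads", "fasta")
```

Passing the class in, rather than importing it inside the helper, keeps the loop reusable for any converter that follows the same calling convention.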
Available Converters¶
| Converters | CI testing | Default method |
|---|---|---|
|  |  | Unix commands |
|  |  | Pandas |
|  |  | DSRC software |
|  |  | pigz/pbzip2 software |
|  |  | DSRC software |
|  |  | Python |
|  |  | pyexcel library |
|  |  | Pandas library |
|  |  | Pandas library |
Contributors¶
Setting up and maintaining Bioconvert has been possible thanks to users and contributors. Thanks to all.
Changes¶
| Version | Description |
|---|---|
| 1.1.1 |  |
| 1.1.0 |  |
| 1.0.0 |  |
| 0.6.3 |  |
| 0.6.2 |  |
| 0.6.1 |  |
| 0.6.0 |  |
| 0.5.2 |  |
| 0.5.1 |  |
| 0.5.0 |  |
| 0.4.X |  |
| 0.3.X | May 2019. New methods abi2qual, bigbed2bed, etc. Added --threads option |
| 0.2.X | Aug 2018. abi2fastx, bioconvert_stats tool added |
| 0.1.X | Major refactoring to have subcommands with implicit/explicit mode |
Complete documentation including User and Developer Guides¶
- 1. Installation
- 2. User Guide
- 3. Tutorial
- 4. Developer guide
- 4.1. Quick start
- 4.2. Installation for developers
- 4.3. How to add a new conversion
- 4.4. One-to-many and many-to-one conversions
- 4.5. How to add a new method to an existing converter
- 4.6. Decorators
- 4.7. How to add a test
- 4.8. How to add a test file
- 4.9. How to locally run the tests
- 4.10. How to benchmark your new method vs others
- 4.11. How to add your new converter to the main documentation?
- 4.12. pep8 and conventions
- 4.13. Requirements files
- 4.14. How to update bioconvert on bioconda
- 4.15. Sphinx Documentation
- 4.16. Docker
- 5. Benchmarking
- 6. Gallery
- 7. References
- 8. Formats
- 8.1. TWOBIT
- 8.2. AGP
- 8.3. ABI
- 8.4. ASQG
- 8.5. BAI
- 8.6. BAM
- 8.7. BCF
- 8.8. BCL
- 8.9. BED for plink
- 8.10. BEDGRAPH
- 8.11. BED
- 8.12. BED3
- 8.13. BED4
- 8.14. BED5
- 8.15. BED6
- 8.16. BED12
- 8.17. BIGBED
- 8.18. BIGWIG
- 8.19. BIM
- 8.20. BZ2
- 8.21. COV
- 8.22. CRAM
- 8.23. CLUSTAL
- 8.24. CSV
- 8.25. DSRC
- 8.26. EMBL
- 8.27. FAM
- 8.28. FAA
- 8.29. FASTA
- 8.30. FastG
- 8.31. FastQ
- 8.32. GENBANK
- 8.33. GENPEPT
- 8.34. GFA
- 8.35. GTF
- 8.36. GFF
- 8.37. GZ
- 8.38. JSON
- 8.39. MAF (Mutation Annotation Format)
- 8.40. MAF (Multiple Alignment Format)
- 8.41. MAP
- 8.42. NEWICK
- 8.43. NEXUS
- 8.44. ODS
- 8.45. PAF (Pairwise mApping Format)
- 8.46. PDB
- 8.47. PED
- 8.48. PHYLOXML
- 8.49. PHYLIP
- 8.50. PLINK flat files (MAP/PED)
- 8.51. PLINK binary files (BED/BIM/FAM)
- 8.52. QUAL
- 8.53. SAM
- 8.54. SCF
- 8.55. SRA
- 8.56. TSV
- 8.57. STOCKHOLM
- 8.58. VCF
- 8.59. WIG
- 8.60. WIGGLE (WIG)
- 8.61. XLS
- 8.62. XLSX
- 8.63. XMFA
- 8.64. YAML
- 8.65. Others
- 8.66. IG
- 9. Glossary
- 10. Faqs
- 11. Bibliography