Access the Bioinfo-Portal
The portal was accessed 3664 times.
885 applications were executed.

Tutorial - 2019

1. The Brazilian Bioinformatics Network

The Brazilian Bioinformatics Network (RNBio) was created in 2014 as a result of a Structuring Program of the Ministry of Science, Technology and Innovation (MCTI) that aims to strengthen research involving the use of bioinformatics in Brazil. The founding core of RNBio consists of researchers linked to three institutions: the National Laboratory for Scientific Computing (LNCC), the National Biosciences Laboratory (LNBio/CNPEM) and the Federal University of Minas Gerais (UFMG). RNBio also aims to work closely with the High Performance Processing National Centers (CENAPADs) structured in the National System of High Performance Computing (SINAPAD). In addition, the network maintains active research collaborations with Brazilian institutions such as the University of São Paulo, the University of Campinas, the Federal University of Rio de Janeiro and the Fluminense Federal University. The network's mission is to foster the development of research projects in a multicentric format and the training of human resources in thematic studies involving computational biology. To this end, one of the network's objectives is the provision of a Bioinformatics Portal, which is currently operational, managed and hosted at LNCC.

2. Bioinfo-Portal: The Bioinformatics Portal for Parallel and Distributed Execution with High Performance Computing

Bioinfo-Portal (https://bioinfo.lncc.br/) is a bioinformatics portal developed through the efforts of the RNBio contributors (Figure 1). It is an easy-to-use service that follows the Software as a Service (SaaS) delivery model and was built on top of CSGrid (https://jira.tecgraf.puc-rio.br/confluence/display/CN/CSGrid+Home) to facilitate and support the parallel and distributed execution of several bioinformatics applications, i.e., programs, packages or scientific workflows. Bioinfo-Portal is currently managed by a team with members from the Bioinformatics Laboratory (LABINFO/LNCC) (www.labinfo.lncc.br) and the National System of High Performance Computing (SINAPAD/LNCC) (https://www.lncc.br/sinapad/).

Figure 1. The Bioinfo-Portal main page

2.1 Availability and Requirements

1. Project name: Bioinfo-Portal

2. Domain: https://bioinfo.lncc.br/

3. Operating system(s): Platform independent

4. Programming languages: Java, Bash, Python, and Perl

5. License: None

6. Attribution: Attribution-ShareAlike 3.0 Unported License

7. Restrictions on use: None

2.2 Available Applications

The Bioinfo-Portal v.2-2019 presents the following bioinformatics applications (others may be included in future versions as users/developers require them):

1. Programs/packages widely used in several areas of bioinformatics, presented in more detail in Supplementary Table 1: Align-m, bcftools, BEAST2, Bowtie2, bwa, ClustalW2, codeml, ExaML, FragGeneScan, GeneMarkS, Glimmer3, HMMER3, Kalign2, MAFFT, MetaGeneMark, ModelGenerator, MUSCLE, NxTrim, PartitionFinder, PHYLIP, ProbCons, RAxML, Ray, ReadSeq, samtools, SPAdes, and T-Coffee.

2. Scientific workflows (Supplementary Table 2) modeled, managed and executed with the Scientific Workflow Management Systems (SWfMS) SciCumulus (https://scicumulusc2.wordpress.com/) and Swift (http://swift-lang.org/).

3. Submitting, Executing and Receiving Results from a Job

The Bioinfo-Portal is available and functional at https://bioinfo.lncc.br/. In the current version, scientists can submit a single input data file per job. Later versions of Bioinfo-Portal will allow jobs with multiple inputs, thus benefiting from the parallelism and distribution features offered by the CSGrid environment. Table 1 presents the Bioinfo-Portal tutorial, which details the main steps to be followed by users.


Table 1. Bioinfo-Portal's Tutorial

1. Go to the Application tab (at the left corner)

2. Select a program or workflow

3. Each program/workflow panel presents:

a. Default parameters for specifying input/output data and specific parameters

b. Example test files for input data

c. E-mail for returning results

d. reCAPTCHA (“I am not a robot”), for validating submissions

e. Button for running the job

4. The symbol (?) appears in some parts of Bioinfo-Portal and provides additional information about parameters, e.g., the format or type of sequences

5. For instance, to run the application Align-m:

6. Select parameters

7. Select the input data or example test file

a. The input data is limited to 100 Mb

8. Fill in the E-mail field to receive the link for downloading the results when the job is finished; this field is mandatory

a. This link is accessible for 2 (two) weeks

9. Bioinfo-Portal uses reCAPTCHA to validate the user and to protect the portal from spam and abusive scripts

10. The reCAPTCHA example is based on a classic computer vision problem, image labeling

11. Users are asked to select all the images that match the clue

12. If the reCAPTCHA authentication is valid:

a. Users are confirmed as “I'm not a robot”

b. Then, they are allowed to run the application

13. If the authentication is not valid, see Table 2 (Possible Errors Notified by Bioinfo-Portal v.1) in Section 6

 

14. Executing the job:

a. Users will receive a message indicating the Application (app) is running

b. Users will receive an E-mail when the job is finished

c. Users can check the job status by clicking the link

d. Or users can return to the main page to submit a new application’s job

 

15. Users can reload the link to check the status of the execution (i.e., executing or finished), or they can wait for the E-mail sent when the job is finished

16. Users receive an E-mail confirming that the job finished

 

17. The E-mail contains a message with the outcome (job successfully executed or not) and a link for downloading the execution results

18. If the job is not successfully executed, see Table 2 (Possible Errors Notified by Bioinfo-Portal v.1) in Section 6

19. Users can download the results by clicking on the link

20. Users receive a zipped directory containing the results

21. Final results obtained
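
Once the zipped results directory from steps 19 to 21 has been downloaded, it can be unpacked with any archive tool. The following is a minimal Python sketch of that step, not part of the portal itself. The function name `extract_results` and the archive and directory names in the usage note are hypothetical, and the sketch assumes the download is a standard ZIP archive.

```python
import zipfile
from pathlib import Path


def extract_results(archive_path, target_dir):
    """Extract a downloaded results archive and return the extracted paths.

    Assumption: the portal delivers a standard ZIP archive.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member in archive: {bad_member}")
        archive.extractall(target)
        names = archive.namelist()
    return [target / name for name in names]
```

For example, `extract_results("results.zip", "alignm_results")` would unpack a hypothetical `results.zip` into an `alignm_results` directory and return the list of extracted paths.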

 


4. Limitations in Bioinfo-Portal v.1

For the present version of Bioinfo-Portal, we defined some limitations on job submission, execution and result retrieval:

a. Each submission is limited to one (1) job;

b. The link to the final results remains accessible for one (1) week;

c. The total size of the input files is limited to 100 Mb;

d. The total execution time for a job is limited to one (1) day.
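
Because a submission that exceeds the input size limit is rejected, it can save time to check the files locally before uploading. The sketch below is one possible way to do so in Python. It is not portal code: the function names are hypothetical, and it assumes that "100 Mb" means 100 × 10^6 bytes, which the portal documentation does not state explicitly.

```python
from pathlib import Path

# Assumption: the portal's "100 Mb" limit is read here as 100 * 10**6 bytes.
MAX_INPUT_BYTES = 100 * 10**6


def total_input_size(paths):
    """Return the combined size in bytes of the given input files."""
    return sum(Path(p).stat().st_size for p in paths)


def within_portal_limit(paths, limit=MAX_INPUT_BYTES):
    """Return True if the combined size of the files does not exceed the limit."""
    return total_input_size(paths) <= limit
```

Calling `within_portal_limit(["sequences.fasta"])` on a hypothetical input file returns `False` when the file is too large under this assumption, in which case the input should be reduced before submission.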

5. Abbreviations

App: Application; HPC: High Performance Computing; LABINFO: Bioinformatics Laboratory; LNCC: National Laboratory for Scientific Computing; MSA: Multiple Sequence Alignment; SaaS: Software as a Service; SWfMS: Scientific Workflow Management Systems; SINAPAD: National System of High Performance Computing.

6. Possible Errors

Table 2 presents the possible errors that can be found in Bioinfo-Portal. Any other errors can be reported directly to the contact karyann at lncc.br.

Table 2. Possible Errors Notified by Bioinfo-Portal v.1

1. If the reCAPTCHA authentication is not valid, that message will be presented

a. In this case, users need to resubmit the reCAPTCHA, making sure that the image selection is correct

 

2. If the input data exceeds the size limit of 100 Mb

a. In this case, users need to resubmit using input data smaller than 100 Mb

 

3. If the job is not successfully executed, that message will be presented

a. In this case, users need to review whether the input data and parameters are correct

4. Users/specialists should analyze the error message returned by the chosen application, and/or

5. validate the input data (e.g., formats, features), and/or

6. adjust the parameters required by the chosen application, and/or

7. run a test beforehand:

a. Users can run a test job using the example test file (available for all Bioinfo-Portal applications)

b. If the error persists, please contact us
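
Validating the input format before resubmitting can rule out one common cause of failed jobs. As an illustration only, the Python sketch below performs a basic structural check on FASTA text. It assumes the chosen application expects FASTA input, which is not true of every application on the portal, and the function name `check_fasta` is hypothetical. It checks only the record layout, not whether the residues are valid for a given sequence type.

```python
def check_fasta(text):
    """Return a list of structural problems in FASTA text (empty if none).

    Assumption: the chosen application expects FASTA-formatted input.
    """
    problems = []
    records = 0
    has_sequence = True  # no record is open yet
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith(">"):
            if records and not has_sequence:
                problems.append(f"line {number}: previous record has no sequence")
            if len(line) == 1:
                problems.append(f"line {number}: header has no identifier")
            records += 1
            has_sequence = False
        elif records == 0:
            problems.append(f"line {number}: sequence data before first '>' header")
        else:
            has_sequence = True
    if records == 0:
        problems.append("no FASTA records found")
    elif not has_sequence:
        problems.append("last record has no sequence")
    return problems
```

An empty list means no structural problem was detected. Any reported problem points to the line to inspect before submitting the job again.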

 

[1]

8. Team

LABINFO-LNCC

D.Sc. Ana Tereza Ribeiro de Vasconcelos, D.Sc. Kary A. D. C. S. Ocaña

LNBIO/CNPEM

Ph.D. Paulo Sergio Lopes de Oliveira

SINAPAD-LNCC

D.Sc. Antônio Tadeu Azevedo Gomes, M.Sc. Bruno Fernandes Bastos, D.Sc. Luiz Manoel Rocha Gadelha Jr., B.Sc. Marcelo Monteiro Galheigo, M.Sc. Vívian Medeiros

UFMG

Dr. Santuza Maria Ribeiro Teixeira

Collaboration

D.Sc. Gabriela Flávia Rodrigues Luiz (UFMG), M.Sc. José Geraldo de Carvalho Pereira (University of São Paulo), M.Sc. Sheila Tiemi Nagamatsu (University of Campinas)

9. References

[1] M. Mondelli, O. Torreño, K. Ocaña, M. Mattoso, M. Wilde, A. Vasconcellos, O. Trelles, and L. Gadelha, “SwiftGecko: a provenance-enabled parallel comparative genomics workflow,” presented at the X-Meeting International Workshop and Proceedings, SP, Brazil, 2015.

[2] O. Torreno and O. Trelles, “Breaking the computational barriers of pairwise genome comparison,” BMC Bioinformatics, vol. 16, no. 1, Dec. 2015.

10. Download the Bioinfo-Portal tutorial

Tutorial-Bioinfo -Portalv1.pdf