Function in biology
Function is a complex term. It can be described as what something (e.g., a molecule) does.
From an evolutionary perspective, function does not imply a purpose.
Functional terms
Synonyms: keywords, glossaries, vocabularies, etc.
- UniProt controlled vocabulary
- UniProt keywords
- Enzyme Nomenclature Database (EC)
- COG
- KEGG Orthology
- Online Mendelian Inheritance in Man
- etc.
Ontologies
- Ontology: a more complex form of controlled vocabulary, with formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse. (Ref)
Gene Ontology
Browsing tools:
It actually comprises 3 ontologies:
- Molecular function
- Biological process
- Cellular component
Terms are linked by different kinds of relationships: is a, part of, regulates, etc.
Graph representations
Types of graphs in biology:
- Simple directed (tree): Taxonomy (1 parent)
- Directed acyclic: Gene Ontology (1 or more parents)
- Undirected: Metabolic network relationships
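The difference between a tree and a directed acyclic graph can be sketched in a few lines of Python. This is only a toy illustration: the term names (`root`, `term_a`, ...) and the `PARENTS` dictionary are hypothetical and do not come from the real Gene Ontology. The point is that a term with more than one parent reaches the root through several paths, so collecting its ancestors requires visiting every parent link.

```python
from collections import deque

# Toy directed acyclic graph: child -> list of (relationship, parent).
# All term names are hypothetical, not real Gene Ontology terms.
PARENTS = {
    "root": [],
    "term_a": [("is_a", "root")],
    "term_b": [("is_a", "root")],
    "term_c": [("is_a", "term_a"), ("part_of", "term_b")],  # two parents
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relation, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(ancestors("term_c")))
```

In a taxonomy every entry of such a dictionary would hold at most one parent, and the same function would simply return the lineage of the node.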
Other ontologies
Homology: Orthology, analogy, paralogy
- Ortholog: Same ancestral origin; the genes diverged through a speciation event
- Analog: Not the same origin (despite having the same structure and/or function)
- Paralog: Same ancestral origin, but the genes diverged through a duplication event
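The definitions above can be turned into a small sketch: two homologous genes are classified by the event found at their last common ancestor in a gene tree. The tree, the gene names and the event labels below are made up for illustration, and the sketch assumes the events have already been inferred by some other method.

```python
# Toy gene tree: child -> parent. All names are hypothetical.
PARENT = {
    "human_A": "n1", "mouse_A": "n1",
    "human_B": "n2", "mouse_B": "n2",
    "n1": "dup", "n2": "dup",
}
# Event assumed to be known for each internal node.
EVENT = {"n1": "speciation", "n2": "speciation", "dup": "duplication"}


def path_to_root(node):
    """Return the node followed by all its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def relationship(gene_a, gene_b):
    """Classify two distinct genes by the event at their last common ancestor."""
    ancestors_b = set(path_to_root(gene_b))
    lca = next(n for n in path_to_root(gene_a) if n in ancestors_b)
    return "orthologs" if EVENT[lca] == "speciation" else "paralogs"


print(relationship("human_A", "mouse_A"))  # last common ancestor is n1
print(relationship("human_A", "human_B"))  # last common ancestor is dup
```

Analogs do not appear in such a tree at all, since they do not share an ancestor.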
Moonlighting
- Databases: MoonProt and MultiTaskDB
- Example: P99999 - Cytochrome c
- It transfers electrons between Complex III (Coenzyme Q - Cyt c reductase) and Complex IV (Cyt c oxidase)
- It also takes part in controlling apoptosis
Structure
- Primary (sequence)
- Secondary
  - Protein
  - RNA
- Tertiary
  - Protein:
    - Domains
    - Binding site
    - Catalytic site
- Quaternary
  - Protein:
    - Protein complexes
    - Protein-protein interaction
    - Protein-nucleotide
Warm-up exercise
- Identify this protein and download it as a FASTA file.
>protein S
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFR
SSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIR
GWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY
SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQ
GFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFL
LKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITN
LCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCF
TNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYN
YLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPY
RVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFG
RDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAI
HADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPR
RARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTM
YICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFG
GFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFN
GLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQN
VLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGA
ISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMS
ECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAH
FPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELD
SFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELG
KYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSE
PVLKGVKLHYT
- Download the associated genome for this protein and save it as FASTA.
- Find an mRNA molecule related to it (check the Entrez help reference from the 1st session).
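Once a FASTA file has been saved, it can be checked with a few lines of Python. The following is a minimal sketch of a FASTA reader, not a replacement for a dedicated library; the function name `read_fasta` is our own, and the example text reuses only the first two lines of the protein above.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Truncated example: only the first two sequence lines of the exercise protein.
example = """>protein S
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFR
SSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIR
"""

for name, sequence in read_fasta(example).items():
    print(name, len(sequence), sequence[:10])
```

For a downloaded file, the same function can be applied to the result of `open(path).read()`, where `path` is wherever the file was saved.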
Structural and functional annotation
- In summary: assign a function based on a feature or metric derived from the sequence alone and/or from other sources of information.
- KEY QUESTION: Would the sequences alone be enough to assign any kind of function in the end?
- Secondary structure prediction
- Generic (e.g. PredictProtein). More…
- RNA Structure prediction software
- Signal peptides. Example: SignalP
- Subcellular location
- Protein. List Example Program: LocTree3
- RNA. Example: RNALocate
- Exercise: Test our protein (and your RNA if possible) with the different tools above.
Sequence-based search
PRINTS
- PRINTS - Motifs/fingerprints
PROSITE
- PROSITE - Patterns or profiles
PATTERN (doc)
PA P-x-[STA]-x-[LIV]-[IVT]-x-[GS]-G-Y-S-[QL]-G.
PROFILE (doc) Representation of a matrix…
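The PROSITE pattern syntax shown above maps closely onto regular expressions, which makes it easy to scan a sequence for a pattern by hand. The sketch below covers only part of the syntax (`x`, `[...]`, `{...}` and repetition counts such as `x(2)` or `x(2,4)`); anchors and other features are not handled. The function name `prosite_to_regex` and the test sequence are made up for illustration.

```python
import re


def prosite_to_regex(pattern):
    """Translate a (simple) PROSITE pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # patterns end with a period
    regex = []
    for element in pattern.split("-"):
        # Optional repetition suffix such as x(2) or x(2,4).
        repeat = ""
        match = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if match:
            element, count = match.groups()
            repeat = "{" + count + "}"
        if element == "x":
            regex.append("." + repeat)  # any residue
        elif element.startswith("{") and element.endswith("}"):
            regex.append("[^" + element[1:-1] + "]" + repeat)  # excluded residues
        else:
            regex.append(element + repeat)  # single residue or [ABC] alternatives
    return "".join(regex)


regex = prosite_to_regex("P-x-[STA]-x-[LIV]-[IVT]-x-[GS]-G-Y-S-[QL]-G.")
print(regex)

# Made-up sequence that contains one occurrence of the pattern.
hit = re.search(regex, "MKPASALIAGGYSQGTT")
print(hit.start(), hit.group())
```

A profile, in contrast, scores every position with a matrix of weights, so it cannot be reduced to a regular expression in this way.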
MEME
- For nucleotides: MEME suite
- Exercise: Test our protein (our DNA and your RNA if possible) with the different tools above.
Domain search
Databases
- SMART
- Pfam (and its RNA equivalent: Rfam)
- CATH
- Exercise: try our protein in Pfam search, CATH and CDD, and check the results.
- Exercise: try your mRNA in Rfam search.
Platforms that combine several tools
- Exercise: Test our protein with a few of the provided tools (only check a few options, otherwise it will take too long).
InterPro
Protein families integrated from the different methods and databases described above.
- Exercise: Test our protein with InterProScan
KEGG
- Exercise: Test our protein