README
======

The Kew Tree of Life Explorer allows users to explore evolutionary trees of life and to access the genomic data that underpin them. It is an output of the Plant and Fungal Trees of Life Project (PAFTOL) at the Royal Botanic Gardens, Kew (https://www.kew.org/science/our-science/projects/plant-and-fungal-trees-of-life), which aims to discover and disseminate the evolutionary history of all plant and fungal genera through phylogenetic approaches.

Tree of Life data are periodically released via the Kew secure file transfer protocol (SFTP) site (sftp.kew.org/pub/treeoflife) and are additionally made available for interactive web-based exploration at http://treeoflife.kew.org.

This document contains the following sections:

1. The Kew Tree of Life SFTP site
   Overview of directory structure and files contained within each directory
2. File naming conventions
3. FASTA headers
4. Manifests
   Description of file formats for sequence_manifest.txt, deleted_sequences.txt, specimen_manifest.txt, revised_specimen_nomenclature.txt, gene_manifest.txt

1. The Kew Tree of Life SFTP site
=================================

|-- README.txt       This document
|
|-- current_release  A link to the current release of Kew Tree of Life Explorer
|
|-- releases         A directory containing all previous releases of Kew Tree of Life data
    |
    |-- One directory for each Kew Tree of Life data release
        |
        |-- kew_tree_of_life_release_notes_.txt  A document describing the contents of the release
        |
        |-- kew_tree_of_life_release_notes.txt   A symlink to the above file
        |
        |-- sequence_manifest.txt  A document listing the accession numbers (in public
        |                          repositories) of all nucleotide sequence data used in the
        |                          release
        |
        |-- deleted_sequences.txt  A document listing the accession numbers (in public
        |                          repositories) of all nucleotide sequence data used in
        |                          previous releases of the Kew Tree of Life that have not
        |                          been used in this one
        |
        |-- specimen_manifest.txt  A document listing the scientific name of all
        |                          species included in this release, with additional
        |                          information about the specimens which have been
        |                          sampled
        |
        |-- revised_specimen_nomenclature.txt  A document identifying changes in
        |                                      specimen nomenclature between
        |                                      successive releases
        |
        |-- gene_manifest.txt  A document listing the genes included in this release
        |
        |-- fasta  A directory containing gene sequence in FASTA format. Sequences are
        |   |      generated from recovery processes, for a number of specified genes, and
        |   |      according to a specified method
        |   |
        |   |-- alignments  A directory containing alignment data for each gene, in aligned
        |   |               FASTA format.
        |   |
        |   |-- by_gene  A directory containing files containing all assembled sequences for
        |   |            a given gene
        |   |
        |   |-- by_recovery  A directory containing files containing all assembled sequences
        |                    for a given recovery
        |
        |-- tree  A directory containing tree files in Newick format for genes and species
        |   |
        |   |-- gene  A directory containing tree files in Newick format for each gene used
        |   |         to build the species tree
        |   |
        |   |-- species  A directory containing the species tree file at species, family and
        |                order levels for this release in Newick format
        |
        |-- nex
        |   |
        |   |-- species  A directory containing the species tree file at species, family and
        |                order levels for this release in NEXUS format
        |
        |-- svg
            |
            |-- species  A directory containing the species tree file at species, family and
                         order levels for this release in SVG format

2. File naming conventions
==========================

Files in fasta/alignments
-------------------------

gene_id..aln.fasta

An alignment is built for each gene from the corresponding sequences in fasta/by_gene. Sequences with poor coverage of the alignment have been removed as described in Baker et al. (https://doi.org/10.1093/sysbio/syab035).

Files in fasta/by_gene
----------------------

..fasta

The Gene ID identifies the pan-species gene concept, and is taken from the Angiosperms353 data set (Johnson et al. 2019, https://doi.org/10.1093/sysbio/syy086).

Molecule types used in this release:

DNA

Protein files may be provided in future releases.

Files in fasta/by_recovery
--------------------------

....fasta

A “recovery” is a bioinformatic analysis of a set of sequence data from a single specimen, yielding a set of gene sequences. All sequence sets used for recoveries are accessioned in a public repository.

Repositories in use in this release:

INSDC: The ENA/GenBank/DDBJ International Nucleotide Sequence Database Collaboration (INSDC).
oneKP: The data repository of the One Thousand Plant Transcriptomes Initiative, hosting the OneKP assembled transcriptomes in this release, available at https://datacommons.cyverse.org/browse/iplant/home/shared/commons_repo/curated/oneKP_capstone_2019
Figshare: An open access repository hosting data from Zhao et al. (2025), used in the release.

The sequence_id is the identifier used for those sequences within the named repository. For merged samples, where the raw data consists of two or more sequencing runs from the same specimen, the sequence_id is an underscore-separated concatenation of the corresponding ENA run accessions (e.g. ERR16917359_ERR7621247 for Asteriscus intermedius).

Sequence sets in use in this release:

a353: the Angiosperms353 gene set (see Johnson et al. 2019, https://doi.org/10.1093/sysbio/syy086)

The files in this directory always contain DNA sequence. It is not anticipated that protein sequence files will be made available on a per recovery basis.

Files in tree/gene
------------------

gene_id.tree

A gene tree for each gene, built from the corresponding alignments in fasta/alignments, in Newick format. Nodes are labelled as follows:

____

Where the sequence ID is the identifier of the sequence derived from the sample as stored in a sequence repository. Further details are provided in the sequence manifest.

Files in tree/species
---------------------

treeoflife..tree
Treeoflife.current.tree
treeoflife.4.0.family.tree
treeoflife.4.0.order.tree
treeoflife.wastral.4.0.tree (release 4.0 only)

A file containing the Kew "tree of life" for all species included in this release in Newick format. The file name contains the release ID; a symlink to the current tree is provided with every release for convenient download.
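
The merged-sample convention described above for fasta/by_recovery can be unpacked in a few lines of Python. This is an illustrative sketch only, not part of the release tooling. The function name split_run_accessions is hypothetical, and the sketch assumes that each part of a merged identifier is an INSDC run accession of the form ERR, SRR or DRR followed by digits. Identifiers that do not fit that pattern are returned unchanged, since an underscore in an identifier does not necessarily indicate a merged sample.

```python
import re

# Assumption: INSDC run accessions look like ERR/SRR/DRR followed by digits.
RUN_ACCESSION = re.compile(r"[EDS]RR\d+")


def split_run_accessions(sequence_id):
    """Split a merged sequence_id into its run accessions.

    Returns a one-element list when the identifier is not an
    underscore-separated list of run accessions.
    """
    parts = sequence_id.split("_")
    if len(parts) > 1 and all(RUN_ACCESSION.fullmatch(p) for p in parts):
        return parts
    return [sequence_id]


# Example taken from this document (Asteriscus intermedius).
print(split_run_accessions("ERR16917359_ERR7621247"))
# prints ['ERR16917359', 'ERR7621247']
```

The same helper could be applied to node labels in the tree files or to the sequence identifier column of the manifests, where the same convention is used.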

treeoflife.all_support_values..tree
treeoflife.all_support_values.current.tree

A file containing the Kew "tree of life" for all species included in this release in Newick format, with the inclusion of all support value data for each node defined as follows:

q1: quartet support for the main topology
q2: quartet support for the first alternative topology
q3: quartet support for the second alternative topology
f1: number of quartet trees in all the gene trees that support the main topology
f2: number of quartet trees in all the gene trees that support the first alternative topology
f3: number of quartet trees in all the gene trees that support the second alternative topology
pp1: local posterior probability for the main topology
pp2: local posterior probability for the first alternative topology
pp3: local posterior probability for the second alternative topology
QC: number of quartets defined around each branch (ASTRAL III and ASTRAL-MP only)
EN: effective number of genes for the branch (ASTRAL III and ASTRAL-MP only)

Nodes are labelled (in both trees) as follows:

____

Where the sequence ID is the identifier of the sequence derived from the sample as stored in a sequence repository. For recoveries generated from multiple sequencing runs from the same specimen, an underscore-separated list of the corresponding run accessions is used (e.g. ERR16917359_ERR7621247 for Asteriscus intermedius). Further details are provided in the sequence manifest.

Files in nex
------------

treeoflife..nex
Treeoflife.current.nex
treeoflife.4.0.family.nex
treeoflife.4.0.order.nex

A file containing the Kew "tree of life" for all species included in this release in NEXUS format.
The file name contains the release ID; a symlink to the current tree is provided with every release for convenient download, as are the order and family trees.

Files in SVG
------------

treeoflife..svg
Treeoflife.current.svg
treeoflife.4.0.family.svg
treeoflife.4.0.order.svg

A file containing the Kew "tree of life" for all species included in this release in Scalable Vector Graphics (SVG) format. The file name contains the release ID; a symlink to the current tree is provided with every release for convenient download, as are the order and family trees.

3. FASTA headers
================

Sequences in FASTA files have headers as follows:

> Gene_Name: Species: Repository: Sequence_ID:

The Gene ID identifies the pan-species gene concept, and is taken from the Angiosperms353 data set (Johnson et al. 2019, https://doi.org/10.1093/sysbio/syy086).

The gene name is an exemplar gene name for the gene that has been recovered (i.e., in use for this gene in at least one of the species from which the gene has been recovered). It is not necessarily the name by which the gene is known in the recovered species. All instances of this gene are assigned the same name in a single release. The gene name is not guaranteed to be stable between releases. To identify the same gene in successive releases, use the Gene ID. If no suitably named exemplar gene has been found, the gene name is given as ‘NA’.

The species name comprises genus and species names in accordance with scientific convention and uses underscores in place of spaces.

The sequence repository and sequence identifier are defined as in the names of the files in the fasta/by_recovery directory (see above).

4. Manifests
============

Deleted Sequences
-----------------

This (deleted_sequences.txt) is a tab-delimited file, with columns as follows:

1. Repository name
2. Sequence identifier
3. Sequence type. One of genome, transcript, read.
4. Scientific species name
5. Release ID where first included
6. Release ID from which sequence was deleted
7. Reason for deletion

For samples generated from multiple sequencing runs, the sequence identifier is an underscore-separated list of the corresponding run accessions (e.g. ERR16917359_ERR7621247 for Asteriscus intermedius).

The values of “Reason for deletion” currently are:

Failed_family_identification: the taxonomic identity of the sequence generated was inconsistent with the sequence obtained at known barcoding loci, or the sample placed in the wrong family in the preliminary tree, in accordance with the procedure described in Baker et al., 2022 (https://doi.org/10.1093/sysbio/syab035).

Manual curation: A specimen was excluded from the current data release if, in preliminary analyses, it failed to occupy a credible position within families, e.g., falling outside the expected subfamilies or excessively far from congeneric or conspecific samples.

Permanently_excluded: A specimen was permanently excluded from the data release if expert review determined that it did not match its species identification or that it had failed to occupy a credible position in the phylogeny in previous analyses (specifically, if it failed the pre-release sample family validation pipeline twice). Other reasons to permanently exclude a sample include: poor gene recovery, potential contamination, or, for external data, an updated or removed accession.

Other: A specimen may have been excluded from a release if it was displaced by a better recovery for the same species, including instances where the original sequence was replaced by merged sequencing runs from the same specimen. In the case of OneKP samples, better recovery from transcriptome raw reads may have replaced an assembled transcriptome from the same sample included in previous releases.
Other reasons for exclusion are: gene recovery below threshold (34 kbp), hybrid genus or species, taxonomic uncertainty (samples identified only to family level), or exceeding the generic sampling limit when displaced by better genus representatives.

In previous releases, the following reasons were also used:

Duplicated_sequencing_run: A different sequencing run has been chosen to represent this sample.
Overrepresented_genus: the sample was displaced by a higher quality sample in the same genus.

Sequence manifest
-----------------

This (sequence_manifest.txt) is a tab-delimited file, with columns as follows:

1. Repository name
2. Sequence identifier
3. Sequence type. One of annotated_genome, unannotated_genome, transcript, read
4. Scientific species name
5. Project name. One of PAFTOL, oneKP, GAP. A '-' is used when the sequence has not been generated by a known phylogenetic project.

The values of 'Repository name' currently in use are:

INSDC: The ENA/GenBank/DDBJ International Nucleotide Sequence Database Collaboration (INSDC)
oneKP: The data repository of the One Thousand Plant Transcriptomes Initiative, available here: https://datacommons.cyverse.org/browse/iplant/home/shared/commons_repo/curated/oneKP_capstone_2019

For samples generated from multiple sequencing runs, an underscore-separated list of the corresponding run accessions is used (e.g. ERR16917359_ERR7621247 for Asteriscus intermedius).

Revised specimen nomenclature
-----------------------------

This (revised_specimen_nomenclature.txt) is a tab-delimited file, with columns as follows:

1. Repository_name
2. Sequence_identifier
3. Old species name
4. New species name
5. Release where new name first used

Specimen manifest
-----------------

This (specimen_manifest.txt) is a tab-delimited file, with columns as follows:

1. Scientific species name
2. Collection ID (of the specimen used); from Index Herbariorum
3. Specimen ID or barcode
4. Voucher information
5. Specimen URL (to an online catalogue entry for that specimen, where available)

The values of 'Collection ID' currently in use are:

A: Harvard University (U.S.A., Massachusetts, Cambridge)
AA: Ministry of Ecology and Natural Resources of the Republic of Kazakhstan (Kazakhstan, Alma-Ata)
AAU: Aarhus University (Denmark, Aarhus)
ABH: Universidad de Alicante (Spain, Alicante)
AD: State Herbarium of South Australia (Australia, South Australia, Adelaide)
ALF: CIRAD (France, Montpellier)
ALTA: University of Alberta (Canada, Alberta, Edmonton)
ALTB: Altai State University (Russia, Barnaul)
AMD: Naturalis Biodiversity Center (Netherlands, Leiden)
APSC: Austin Peay State University (U.S.A., Tennessee, Clarksville)
ARIZ: University of Arizona (U.S.A., Arizona, Tucson)
ASC: Northern Arizona University (U.S.A., Arizona, Flagstaff)
ASU: Arizona State University (U.S.A., Arizona, Tempe)
B: Botanischer Garten und Botanisches Museum Berlin Zentraleinrichtung der Freien Universität Berlin (Germany, Berlin)
BA: Museo Argentino de Ciencias Naturales Bernardino Rivadavia (Argentina, Buenos Aires)
BC: Institut Botànic de Barcelona (Spain, Barcelona)
BCN: University of Barcelona (Spain, Barcelona)
BCRU: Universidad Nacional del Comahue (Argentina, Río Negro, San Carlos de Bariloche)
BG: University of Bergen (Norway, Bergen)
BH: Cornell University (U.S.A., New York, Ithaca)
BHCB: Universidade Federal de Minas Gerais (Brazil, Minas Gerais, Belo Horizonte)
BISH: Bishop Museum (U.S.A., Hawaii, Honolulu)
BJFC: Beijing Forestry University (People's Republic of China, Beijing)
BKF: Department of National Parks, Wildlife and Plant Conservation (Thailand, Bangkok, Chatuchak)
BM: The Natural History Museum (U.K., England, London)
BNRH: Buffelskloof Nature Reserve (South Africa,
Mpumalanga Province, Lydenburg)
BOL: University of Cape Town (South Africa, Western Cape Province, Cape Town)
BONN: University of Bonn (Germany, Bonn)
BR: Meise Botanic Garden (Belgium, Meise)
BRI: Queensland Herbarium (Australia, Queensland, Brisbane)
BRIT: Botanical Research Institute of Texas (U.S.A., Texas, Fort Worth)
BRLU: Universite Libre de Bruxelles (Belgium, Bruxelles)
BRUN: Brunei Forestry Centre (Brunei Darussalam, Belait)
BSB: Freie Universität Berlin (Germany, Berlin)
BZ: Herbarium Bogoriense (Indonesia, Java, Bogor)
C: University of Copenhagen (Denmark, Copenhagen)
CAN: Canadian Museum of Nature (Canada, Quebec, Gatineau)
CANB: Australian National Herbarium (Australia, Australian Capital Territory, Canberra)
CAS: California Academy of Sciences (U.S.A., California, San Francisco)
CAY: Institut de Recherche pour le Développement (IRD) (French Guiana, Cayenne)
CBG: Australian National Herbarium (Australia, Australian Capital Territory, Canberra)
CEN: Embrapa Recursos Genéticos e Biotecnologia - Embrapa Cenargen (Brazil, Distrito Federal, Brasília)
CEPEC: CEPEC, CEPLAC e Universidade Federal do Sul da Bahia (Brazil, Bahia, Itabuna)
CIC: The College of Idaho (U.S.A., Idaho, Caldwell)
CM: Carnegie Museum of Natural History (U.S.A., Pennsylvania, Pittsburgh)
CMUB: Chiang Mai University (Thailand, Chiang Mai Province, Muang Chiang Mai District)
CNS: Australian Tropical Herbarium (Australia, Queensland, Smithfield)
COL: Universidad Nacional de Colombia (Colombia, D.C., Bogota)
CONC: Universidad de Concepción (Chile, Concepción)
CORD: Herbario CORD (Argentina, Córdoba, Córdoba)
CR: Museo Nacional de Costa Rica (Costa Rica, San José)
CS: Colorado State University (U.S.A., Colorado, Fort Collins)
CSH: Shanghai Chenshan Botanical Garden (China, Shanghai, Shanghai)
CTBS: Universidade Federal de Santa Catarina (Brazil, Santa Catarina, Curitibanos)
CTES: Instituto de Botánica del Nordeste (Argentina, Corrientes, Corrientes)
CUMB: Universidade Federal do Pará (Brazil, Pará, Breves)
CUVC: Universidad del Valle (Colombia, Valle del Cauca, Cali)
DAV: University of California, Davis (U.S.A., California, Davis)
DEK: Northern Illinois University (U.S.A., Illinois, Dekalb)
DNA: Department of Environment Parks and Water Security (Australia, Northern Territory, Palmerston)
DS: California Academy of Sciences (U.S.A., California, San Francisco)
E: Royal Botanic Garden Edinburgh (U.K., Scotland, Edinburgh)
EA: National Museums of Kenya (Kenya, Nairobi)
EGE: Ege University (Türkiye, Izmir)
EIF: Universidad de Chile (Chile, Santiago)
F: Field Museum of Natural History (U.S.A., Illinois, Chicago)
FHO: University of Oxford (U.K., England, Oxford)
FI: Natural History Museum (Italy, Firenze)
FLAS: Florida Museum of Natural History (U.S.A., Florida, Gainesville)
FMB: Instituto de Investigación de Recursos Biológicos Alexander von Humboldt (Colombia, Villa de Leyva)
FR: Senckenberg Gesellschaft für Naturforschung: Senckenberg Forschungsinstitut und Naturmuseum (Germany, Frankfurt)
FTG: Fairchild Tropical Botanic Garden (U.S.A., Florida, Miami)
FUMH: Ferdowsi University of Mashhad (Iran, Khorassan, Mashhad)
G: Conservatoire et Jardin botaniques de la Ville de Genève (Switzerland, Genève)
GAZI: Gazi University (Türkiye, Ankara, Ankara)
GB: University of Gothenburg (Sweden, Göteborg)
GC: University of Ghana (Ghana, Legon)
GENT: Ghent University (Belgium, Ghent)
GH: Harvard University (U.S.A., Massachusetts, Cambridge)
GOET: Universität Göttingen (Germany,
Göttingen)
GUAY: Universidad de Guayaquil (Ecuador, Guayas, Guayaquil)
GZU: Karl-Franzens-Universität Graz (Austria, Graz)
HAW: University of Hawaii (U.S.A., Hawaii, Honolulu)
HBG: University of Hamburg (Germany, Hamburg)
HEID: University of Heidelberg (Germany, Heidelberg)
HEM: Universidad de Ciencias y Artes de Chiapas (Mexico, Chiapas, Tuxtla Gutiérrez)
HITBC: Xishuangbanna Tropical Botanical Garden, Academia Sinica (People's Republic of China, Yunnan, Xishuangbanna)
HKU: The University of Hong Kong (China, Hong Kong)
HNG: Université Gamal Abdel Nasser de Conakry (UGANC) (Republic of Guinea, Conakry)
HNU: VNU University of Science, Hanoi (Vietnam, Hanoi)
HNWP: Northwest Institute of Plateau Biology, Chinese Academy of Sciences (China, Qinghai, Xining)
HO: Tasmanian Museum and Art Gallery (Australia, Tasmania, Hobart)
HPUJ: Pontificia Universidad Javeriana (Colombia, D.C., Santafé de Bogotá)
HRCB: Universidade Estadual Paulista (Brazil, São Paulo, Rio Claro)
HTW: Universidad Nacional de la Patagonia San Juan Bosco - Sede Trelew (Argentina, Chubut, Trelew)
HUA: Universidad de Antioquia (Colombia, Antioquia, Medellín)
HUAZ: Universidad de la Amazonia (Colombia, Caquetá, Florencia)
HUB: Hacettepe University (Turkey, Ankara)
HUEFS: Universidade Estadual de Feira de Santana (Brazil, Bahia, Feira de Santana)
HUFU: Universidade Federal de Uberlândia (Brazil, Minas Gerais, Uberlândia)
HUT: Universidad Nacional de La Trujillo-Trujillo (Peru, Trujillo)
HZU: Zhejiang University (China, Zhejiang, Hangzhou)
IBSC: South China Botanical Garden (People's Republic of China, Guangdong, Guangzhou)
IBUG: Universidad de Guadalajara (Mexico, Jalisco, Zapopan)
ICN: Universidade Federal do Rio Grande do Sul (Brazil, Rio Grande do Sul, Porto Alegre)
IDS: Idaho State University (U.S.A., Idaho, Pocatello)
IEB: Instituto de Ecología, A.C. (Mexico, Michoacán, Pátzcuaro)
INB: Instituto Nacional de Biodiversidad (Costa Rica, Santo Domingo)
INPA: Instituto Nacional de Pesquisas da Amazônia (Brazil, Amazonas, Manaus)
JBB: Jardín Botánico José Celestino Mutis (Colombia, Bogotá, D.C., Bogotá, D.C.)
JBL: Jardín Botánico Lankester, Universidad de Costa Rica (Costa Rica, Cartago)
JE: Senckenberg Institute for Plant Form and Function at Friedrich Schiller University (Germany, Jena)
JEPS: University of California (U.S.A., California, Berkeley)
JRAU: University of Johannesburg (South Africa, Gauteng Province, Johannesburg)
K: Royal Botanic Gardens (U.K., England, Kew)
KAS: University of Kassel (Germany, Kassel)
KLU: University of Malaya (Malaysia, Kuala Lumpur)
KPBG: Kings Park and Botanic Garden (Australia, Western Australia, Perth)
KRB: Kebun Raya Bogor (Indonesia, Bogor)
KUN: Kunming Institute of Botany, Chinese Academy of Sciences (People's Republic of China, Yunnan, Kunming)
L: Naturalis Biodiversity Center (Netherlands, Leiden)
LE: Komarov Botanical Institute of RAS (Russia, Saint Petersburg)
LISC: Instituto de Investigação Científica Tropical (Portugal, Lisboa)
LP: Museo de La Plata (Argentina, Buenos Aires, La Plata)
LPB: Herbario Nacional de Bolivia, Universidad Mayor de San Andrés (Bolivia, La Paz)
LYJB: Jardin botanique de la ville de Lyon (France, Lyon)
M: Botanische Staatssammlung München (Germany, München)
MA: Real Jardín Botánico (Spain, Madrid, Madrid)
MAPR: University of Puerto Rico, Mayagüez Campus (Puerto Rico, Puerto Rico, Mayagüez)
MAU: The Mauritius Herbarium (Mauritius, Reduit)
MBA: Environmental Protection Agency (Australia, Queensland, Mareeba)
MBML: Instituto Nacional da Mata Atlântica - INMA (Brazil, Espírito Santo, Santa Teresa)
MEDEL: Universidad Nacional de Colombia - Sede de Medellín (Colombia, Antioquia, Medellín)
MEL: Royal Botanic Gardens Victoria (Australia, Victoria, Melbourne)
MELU: University of Melbourne (Australia, Victoria, Parkville)
MEXU: Universidad Nacional Autónoma
de Mexico (Mexico, Mexico City)
MHA: Main Botanical Garden of the Russian Academy of Sciences (Russia, Moscow)
MICH: University of Michigan (U.S.A., Michigan, Ann Arbor)
MIN: University of Minnesota (U.S.A., Minnesota, St. Paul)
MIR: Shahid Bahonar University of Kerman (Iran, Kerman, Kerman)
MJG: Johannes Gutenberg-Universität (Germany, Mainz)
MO: Missouri Botanical Garden (U.S.A., Missouri, Saint Louis)
MONTU: University of Montana (U.S.A., Montana, Missoula)
MPU: Université de Montpellier (France, Montpellier)
MSB: Ludwig-Maximilians-Universität (Germany, München)
MSUN: Westfälische Wilhelms-Universität (Germany, Münster)
MT: Université de Montréal (Canada, Québec, Montréal)
MY: Universidad Central de Venezuela (Venezuela, Aragua, Maracay)
N: Nanjing University (People's Republic of China, Jiangsu, Nanjing)
NBG: South African National Biodiversity Institute (South Africa, Western Cape Province, Cape Town)
NCU: University of North Carolina at Chapel Hill (U.S.A., North Carolina, Chapel Hill)
NCY: Conservatoire et Jardins Botaniques de Nancy, Université de Nancy I (France, Nancy)
NE: University of New England (Australia, New South Wales, Armidale)
NH: South African National Biodiversity Institute (South Africa, KwaZulu-Natal Province, Durban)
NHM: University of Nottingham (U.K., England, Nottingham)
NHMR: Natural History Museum Rijeka (Croatia, Rijeka)
NMC: New Mexico State University (U.S.A., New Mexico, Las Cruces)
NMNL: Natuurmuseum Nijmegen e.o. (Netherlands, Nijmegen)
NOU: Institut de Recherche pour le Développement (IRD) (New Caledonia, Noumea)
NSK: Central Siberian Botanical Garden, Siberian Branch of Russian Academy of Sciences (Russia, Novosibirskaya Oblast', Novosibirsk)
NSW: Royal Botanic Gardens & Domain Trust (Australia, New South Wales, Sydney)
NT: Department of Environment, Parks and Water Security (Australia, Northern Territory, Alice Springs)
NU: Bews Herbarium, University of KwaZulu-Natal, Pietermaritzburg campus (Republic of South Africa, KwaZulu-Natal Province, Pietermaritzburg)
NY: The New York Botanical Garden (U.S.A., New York, Bronx)
OKLA: Oklahoma State University (U.S.A., Oklahoma, Stillwater)
ORT: Instituto Canario de Investigaciones Agrarias (ICIA) (Spain, Canary Islands, Puerto de la Cruz)
OS: Ohio State University (U.S.A., Ohio, Columbus)
OSBU: Universität Osnabrück (Germany, Osnabrück)
P: Muséum National d'Histoire Naturelle (France, Paris)
PERTH: Western Australian Herbarium (Australia, Western Australia, Perth)
PG: Plant Gateway (U.K., Surrey, Kingston-upon-Thames)
PH: Academy of Natural Sciences (U.S.A., Pennsylvania, Philadelphia)
PRE: South African National Biodiversity Institute (South Africa, Gauteng Province, Pretoria)
PTBG: National Tropical Botanical Garden (U.S.A., Hawaii, Kalaheo)
QBG: Queen Sirikit Botanic Garden, The Botanical Garden Organization (Thailand, Chiang Mai, Maerim)
QCA: Pontificia Universidad Católica del Ecuador (Ecuador, Quito)
QCNE: Museo Ecuatoriano de Ciencias Naturales del Instituto Nacional de Biodiversidad (Ecuador, Quito)
QRS: CSIRO (Australia, Queensland, Atherton)
RB: Jardim Botânico do Rio de Janeiro (Brazil, Rio de Janeiro, Rio de Janeiro)
REU: Universite de la Reunion (Reunion, Sainte-Clotilde)
RM: University of Wyoming (U.S.A., Wyoming, Laramie)
RSA: California Botanic Garden (U.S.A., California, Claremont)
S: Swedish Museum of Natural History (Sweden, Stockholm)
SALA: Universidad de Salamanca (Spain, Salamanca)
SAR: Department of Forestry
(Malaysia, Sarawak, Kuching)
SEL: Marie Selby Botanical Gardens (U.S.A., Florida, Sarasota)
SGO: Museo Nacional de Historia Natural (Chile, Santiago)
SHST: Sam Houston State University (U.S.A., Texas, Huntsville)
SI: Instituto de Botánica Darwinion (Argentina, Buenos Aires, San Isidro)
SING: Singapore Botanic Gardens (Singapore, Singapore, Singapore)
SP: Instituto de Botânica (Brazil, São Paulo, São Paulo)
SPF: Universidade de São Paulo (Brazil, São Paulo, São Paulo)
SPFR: Universidade de São Paulo (Brazil, São Paulo, Ribeirão Preto)
SRP: Boise State University (U.S.A., Idaho, Boise)
SUVA: University of the South Pacific (Fiji, Suva)
TAN: Parc Botanique et Zoologique de Tsimbazaza (PBZT) (Madagascar, Antananarivo)
TARI: Research Institute of Forests and Rangelands (Iran, Tehran)
TBGT: Tropical Botanic Garden and Research Institute (India, Kerala, Trivandrum)
TCD: Trinity College (Ireland, Dublin)
TCDL: Trinity College Dublin (Ireland, Dublin)
TEX: University of Texas at Austin (U.S.A., Texas, Austin)
TFC: Universidad de La Laguna (Spain, Canary Islands, San Cristóbal de La Laguna, Tenerife)
TNS: National Museum of Nature and Science (Japan, Tsukuba)
TUH: Tehran University (Iran, Tehran)
TUM: Technische Universität München (Germany, Freising)
U: Naturalis Biodiversity Center (Netherlands, Leiden)
UAPC: University of Alberta (Canada, Alberta, Edmonton)
UB: Universidade de Brasília (Brazil, Distrito Federal, Brasília)
UBT: University of Bayreuth (Germany, Bayreuth)
UC: University of California (U.S.A., California, Berkeley)
UDW: University of KwaZulu-Natal, Westville campus (Republic of South Africa, KwaZulu-Natal Province, Durban)
UEC: Universidade Estadual de Campinas (Brazil, Campinas)
ULS: Universidad de La Serena (Chile, La Serena)
UPCB: Universidade Federal do Paraná (Brazil, Paraná, Curitiba)
UPOS: Universidad Pablo de Olavide (Spain, Sevilla, Dos Hermanas)
UPR: Botanical Garden of the University of Puerto Rico (Puerto Rico, Puerto Rico, Río Piedras)
UPS: Museum of Evolution (Sweden, Uppsala)
UPTC: Universidad Pedagógica y Tecnológica de Colombia (Colombia, Boyacá, Tunja)
US: Smithsonian Institution (U.S.A., District of Columbia, Washington)
USJ: Universidad de Costa Rica (Costa Rica, San José, San Pedro de Montes de Oca)
USM: Universidad Nacional Mayor de San Marcos (Peru, Lima)
UVSC: Utah Valley University (U.S.A., Utah, Orem)
VAL: Universitat de València (Spain, València)
VEN: Universidad Central de Venezuela (Venezuela, Caracas)
W: Naturhistorisches Museum Wien (Austria, Wien)
WAG: Naturalis Biodiversity Center (Netherlands, Leiden)
WIND: National Botanical Research Institute (Namibia, Windhoek)
WIS: University of Wisconsin (U.S.A., Wisconsin, Madison)
WS: Washington State University (U.S.A., Washington, Pullman)
WTU: University of Washington (U.S.A., Washington, Seattle)
WU: Universität Wien (Austria, Wien)
YA: National Herbarium of Cameroon (Cameroon, Yaoundé)
Z: Universität Zürich (Switzerland, Zürich)
ZA: University of Zagreb (Croatia, Zagreb)
ZSS: Sukkulenten-Sammlung Zürich (Switzerland, Zürich)

Where no information is available, a column contains the text '-'.

Gene manifest
-------------

This (gene_manifest.txt) is a tab-delimited file, with columns as follows:

1. Gene ID
2. Exemplar gene name
3. Species from which the exemplar gene name has been taken
4. Database name (of the database from which the exemplar gene name was obtained)
5. Record ID (of the database record from which the exemplar gene name was obtained)
6. URL (to the online database record from which the exemplar gene name was obtained)
7. In tree? (values 'Y' or 'N') - indicates whether this gene was used to build the species tree or not.

The databases from which exemplar gene names are taken are currently:

UniProtKB: The UniProt Knowledgebase (http://www.uniprot.org)

If no suitably named exemplar gene has been found, columns 2 to 6 contain the text ‘-’.
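
As an illustration of the gene manifest layout described above, the following Python sketch reads the seven tab-delimited columns and lists the genes used to build the species tree. It is a minimal example under stated assumptions, not an official parser: the column labels in COLUMNS and the function names are hypothetical, the file is assumed to have no header row, and the sample rows are invented for demonstration.

```python
import csv
import io

# Hypothetical labels for the seven columns described above; the sketch
# assumes that the file itself has no header row.
COLUMNS = ["gene_id", "gene_name", "species", "database",
           "record_id", "url", "in_tree"]


def read_gene_manifest(handle):
    """Return one dict per non-empty row of a tab-delimited gene manifest."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def genes_in_tree(rows):
    """Return the Gene IDs whose 'In tree?' column is 'Y'."""
    return [row["gene_id"] for row in rows if row["in_tree"] == "Y"]


# Invented rows for illustration only; real values come from gene_manifest.txt.
sample = io.StringIO(
    "g0001\tEXAMPLE1\tGenus_species\tUniProtKB\tX00001\thttp://www.uniprot.org\tY\n"
    "g0002\t-\t-\t-\t-\t-\tN\n"
)
rows = read_gene_manifest(sample)
print(genes_in_tree(rows))
# prints ['g0001']
```

To read a downloaded file instead of the in-memory sample, pass an open file handle, for example one obtained with open("gene_manifest.txt", newline=""). If the file turns out to contain a header row, that row would need to be skipped first.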