Metadata

Sources of information for MANDALA vary from specific observations by a data entry person (e.g., specimen is pinned) or specialist (e.g., specimen taxon determination) to interpretations (=value added) of observations (e.g., the label says the specimen was collected in Urbana, Illinois, but checking a gazetteer, we find that this means the country of origin is USA, the county is Champaign County, and we can look up the latitude & longitude and add that to the table too [note this is NOT added to the original label data, but to the LOCALITY table]). The mixture of handling and observation described below by specialists and data entry workers is specific to the Therevid PEET project and may differ for other groups using MANDALA.

The tables in Mandala are shown below, and their names are linked to explanations of the kinds of data that can be recorded in this database. A more philosophical explanation of Mandala’s structural relationships appears in the 2009 chapter on Mandala, and an exhaustive dissection of Mandala’s interior design is provided in the Database Design Report generated for each demo version released (provided as HTML; an XML version can also be generated). Mandala 8 now has only 3 files (reduced from 23) containing 27 tables. The relationships among the tables and their files are color coded and outlined in the database model (click on table names) below.

[Figure: Mandala flowchart]

SPECIMEN: Documents specimens seen and sorted by a specialist; may also include entries from the literature about types that have not been seen by a recent specialist. For the Therevid PEET project, data entry is generally given to an undergraduate student. Specimen identifiers are unique and may consist of an institution code, collection code, and catalog number (following current Darwin Core concepts for the elements of a GUID, or globally unique identifier; note that Darwin Core is undergoing review as a standard, with ratification anticipated in mid-2009).
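The three-part identifier described above can be sketched in a few lines of Python. This is only an illustration, not Mandala's own implementation: the class name, the ":" separator, and the example codes ("INST", "ENT", "000123") are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecimenIdentifier:
    """Hypothetical three-part identifier: institution, collection, catalog number."""
    institution_code: str
    collection_code: str
    catalog_number: str

    def as_text(self, sep: str = ":") -> str:
        # The separator is an assumption; the source does not specify one.
        return sep.join(
            (self.institution_code, self.collection_code, self.catalog_number)
        )


def find_duplicates(identifiers):
    """Return the set of identifiers that occur more than once."""
    seen, duplicates = set(), set()
    for ident in identifiers:
        if ident in seen:
            duplicates.add(ident)
        seen.add(ident)
    return duplicates


# Placeholder values, not real institution or collection codes.
example = SpecimenIdentifier("INST", "ENT", "000123")
print(example.as_text())
```

Because the dataclass is frozen, identifiers are hashable, so a uniqueness check like find_duplicates can run over a whole batch of records before import.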

Other than a specimen’s unique identifier, the following may be recorded in SPECIMEN:

  • a taxon number of the currently recognized taxonomic name (with authority and year),
  • reference to the locality and collection event,
  • sex, type, condition of specimen,
  • verbatim label,
  • how it is preserved, parts dissected,
  • at what developmental stage the specimen was collected and what stage is currently in the collection,
  • dates of pupation and eclosion,
  • name of the person creating the record and date and time it was created, name and date and time the record was last modified,
  • and miscellaneous comments.

Other information recorded about specimens may be stored in related tables:

  • determination history,
  • loan history,
  • extractions, PCR progress, primers used, and sequences attained and deposited in GenBank, with a link to the GenBank URL,
  • morphological characters,
  • relationships between the specimen and other specimens,
  • relationships between the specimen and other taxa (observations), and
  • relationships of the specimen and its environment, including descriptions of behaviors observed.

LOCALITY: To allow for more uniform searching, mapping, and output of collecting label information, political descriptions (link windows may be enlarged) are coupled with those of named geographic features (these may cross political boundaries) in the LOCALITY table. This interpretation of a verbatim label for retrospective data entry often categorizes information beyond that recorded by the collector, through the help of stand-alone electronic-, paper-, and internet-based gazetteers. Although the verbatim label may lack specific information such as county or geographical coordinates (see georeferencing discussion), these may be added to LOCALITY to facilitate mapping of distributions.

Political regions are also automatically linked to a biogeographical region designation (see POLREGNS).

In addition, a displacement calculator has been included to aid in determining the linear displacement from a point (latitude/longitude coordinate) along one of 16 cardinal directions. Support for the recording of alternate coordinate systems, and additional calculators for changing feet to meters (elevation/depth) and miles to kilometers are provided (non-metric measures are assumed to be the convention when units are not expressed in labels originating in the USA and may be a source of error in a limited number of instances).
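The kind of calculation such a displacement calculator performs can be sketched as follows. This is a minimal illustration, not Mandala's actual method: it assumes a spherical Earth with a mean radius of 6371.0088 km, evenly spaced compass points 22.5 degrees apart, and function names invented for the example. The unit factors (0.3048 m per foot, 1.609344 km per mile) are the exact international definitions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; the spherical model is an assumption

# The 16 compass points, clockwise from north, 22.5 degrees apart.
COMPASS_POINTS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
                  "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def bearing_degrees(point: str) -> float:
    """Convert a compass point such as 'NNE' to a bearing in degrees."""
    return COMPASS_POINTS.index(point.upper()) * 22.5


def feet_to_meters(feet: float) -> float:
    return feet * 0.3048


def miles_to_km(miles: float) -> float:
    return miles * 1.609344


def displace(lat: float, lon: float, point: str, distance_km: float):
    """Return (lat, lon) in degrees after moving distance_km along a compass point."""
    theta = math.radians(bearing_degrees(point))
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    phi1, lam1 = math.radians(lat), math.radians(lon)
    sin_phi2 = (math.sin(phi1) * math.cos(delta)
                + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    phi2 = math.asin(max(-1.0, min(1.0, sin_phi2)))  # clamp against rounding
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    lon2 = (math.degrees(lam2) + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return math.degrees(phi2), lon2


# A label reading "5 mi. NNE of Urbana" (coordinates here are approximate).
print(displace(40.11, -88.21, "NNE", miles_to_km(5)))
```

For the short distances typical of collecting labels, the error introduced by the spherical assumption is small compared with the uncertainty of the label itself.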

COLLEVENT: This table records collecting event information that is not locality-related, such as collectors, date(s) of collection, abiotic conditions during collection, and collecting method. Each LOCALITY may have many COLLEVENTs, but each collecting event should have only one LOCALITY associated with it. This combination is then linked to one or more specimens.
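The one-to-many relationships described here (one LOCALITY to many COLLEVENTs, each collecting event linked to one or more specimens) can be sketched with an in-memory SQLite database. The table and column names, collectors, dates, and specimen numbers below are hypothetical and are not Mandala's actual field definitions; only the locality "Urbana, Illinois" comes from the text.

```python
import sqlite3

# Hypothetical schema: each collecting event must point at exactly one locality.
SCHEMA = """
CREATE TABLE locality (
    locality_id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE collevent (
    collevent_id   INTEGER PRIMARY KEY,
    locality_id    INTEGER NOT NULL REFERENCES locality(locality_id),
    collector      TEXT,
    date_collected TEXT,
    method         TEXT
);
CREATE TABLE specimen (
    specimen_id  TEXT PRIMARY KEY,
    collevent_id INTEGER NOT NULL REFERENCES collevent(collevent_id)
);
"""


def build_demo() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO locality VALUES (1, 'Urbana, Illinois')")
    conn.executemany(
        "INSERT INTO collevent VALUES (?, ?, ?, ?, ?)",
        [(10, 1, "Collector A", "1999-06-01", "Malaise trap"),
         (11, 1, "Collector B", "2000-07-15", "hand net")],
    )
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?)",
        [("SPM-0001", 10), ("SPM-0002", 10), ("SPM-0003", 11)],
    )
    conn.commit()
    return conn


def specimens_at(conn: sqlite3.Connection, locality_id: int):
    """List specimen identifiers collected at a locality, across all its events."""
    rows = conn.execute(
        "SELECT s.specimen_id FROM specimen s "
        "JOIN collevent e ON e.collevent_id = s.collevent_id "
        "WHERE e.locality_id = ? ORDER BY s.specimen_id",
        (locality_id,),
    )
    return [row[0] for row in rows]


print(specimens_at(build_demo(), 1))
```

Keeping the locality out of the specimen record means a corrected coordinate or county is entered once and is seen by every collecting event and specimen that points to it.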

DETERMNR: Users can view and enter information into this table without leaving SPECIMEN. It presents a history of determinations (DETERMNR) or identifications of a specimen by an expert. Each determination entry gives the name of the identifier, the year of determination, and the taxonomic name attributed to the specimen at the time of determination (identification). This information is obtained from labels attached to the pin or, in the case of specimens of groups undergoing revision, it is understood that labels will be added to specimens before being redispersed to museums.

DEPOSIT: A history of loans and deposits (DEPOSIT) for a specimen can also be accessed from within SPECIMEN. This includes the loaning collection or institution, date of the loan, loan number, and the institution or collection to which the specimen was loaned. Loan expirations and requests for extensions can also be entered. Each record shows its status (i.e., most up-to-date location of specimen and its provenance), and a comments field.

One kind of transaction can be accessed entirely from within DEPOSIT, bypassing SPECIMEN, whereby a collection manager can list numbers of various taxa being sent to a borrower; borrowers can also record these transactions for themselves in Mandala.

A third kind of transaction results from sorting of bulk samples to subsamples sent to a specialist for identification and study. These loans have their own forms to accumulate and track multiple subsamples to a borrower under a single loanID.

BULKSAMPLE: This table was added to Mandala with version 5 in May 2001 (as BATCHTXA) and its name changed in 2005. It provides additional flexibility regarding the recording of bulk samples or batches of specimens that may or may not be delineated into taxa or numbers of specimens. It also can be used by collections managers and loan recipients in association with DEPOSIT for tracking loans.

MUSEUMS: Collections and museums referenced in the history of loans and deposits are listed in MUSEUMS. Abbreviations currently in use are the 4-letter codens established in “The insect and spider collections of the world” by Ross H. Arnett, Jr., G. Allan Samuelson, and Gordon M. Nishida, 1993. 2nd ed. Flora & Fauna Handbook no. 11. A web version is available through the Bishop Museum with further updates that may not be in use in our table (if changes from the printed manual could not be justified by us, they were not used, but these differences were noted). If a collection was not represented in Arnett, we erected a unique 4-letter code that did not conflict with other codens detailed there. Some 3-letter codens were erected by us for personal collections but will probably be changed to 4 letters in the future. Also of use to researchers in MUSEUMS are fields for the institution, full address, telephone, fax, web address, email, primary contact information, additional contacts, and comments. Much of this information was also gleaned from Arnett et al. 1993.

ORGASSOC: Organisms that were identified and found associated with collected specimens, but were not themselves collected, can be viewed in SPECIMEN via a portal to ORGASSOC. These data are usually found on a label attached to the preserved specimen. Organisms are listed by taxonID from the TAXA table. Because our fly projects are primarily systematic studies of single insect families, the information that is researched and recorded about taxonomic names of plants or other organisms is usually that which is found on the label and may unfortunately be outdated. In some cases the taxon resolution may only be to family or order. The distinction between taxonomic information in ORGASSOC and that recorded in ASSOCSPM is that the latter table contains verifiable associations, because the associated specimens have been collected, given a specimen number, and the association has a definite interactive categorization component (e.g., mating, mimic/model, predator/prey, etc.).

ASSOCSPM: Specimens associated (ASSOCSPM) with other specimens can also be listed and viewed from within SPECIMEN. The associated specimen’s identifier, its taxon, and the type of association (e.g., predator, prey, mate, etc.) can be listed. This information is often recorded on the label, but in the case of a male and female of the same species mounted on the same pin, the assumption of mating pairs was made unless the collector was known to “conserve pins.”

BIOECOBEH: Descriptions that do not involve specific taxa are featured here (formerly BIOASSOC). This table contains actions, interactions, and descriptions of the environment with a specimen at the time of capture. Four fragments (action, conjunction, adjective, object) can comprise a single phrase. The phrases are easily constructed from fragment pop-up lists and these lists can be edited by the user to suit the specifications of the taxa under study. Information coded here comes from journal entries or the verbatim collecting label and is recorded in SPECIMEN in its original form. Information may be encoded in BIOECOBEH from a portal in SPECIMEN and, because it concerns the actions of a specimen, is more closely associated with the specimen. Controlled language tables have the advantage of facilitating searching and categorizing.
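The idea of assembling a phrase from controlled fragment lists can be sketched as below. The fragment values are invented for illustration and are not Mandala's actual pop-up lists; treating each fragment as optional is also an assumption.

```python
# Hypothetical controlled vocabularies, one per fragment type.
FRAGMENTS = {
    "action": {"resting", "hovering", "feeding"},
    "conjunction": {"on", "over", "near"},
    "adjective": {"dry", "open", "shaded"},
    "object": {"sand", "leaf litter", "stream bed"},
}
ORDER = ("action", "conjunction", "adjective", "object")


def build_phrase(**chosen: str) -> str:
    """Join the chosen fragments in fixed order, rejecting uncontrolled terms."""
    unknown = set(chosen) - set(ORDER)
    if unknown:
        raise ValueError(f"unknown fragment type(s): {sorted(unknown)}")
    words = []
    for slot in ORDER:
        value = chosen.get(slot)
        if value is None:
            continue  # assumption: a fragment may be left out
        if value not in FRAGMENTS[slot]:
            raise ValueError(f"{value!r} is not in the {slot} list")
        words.append(value)
    if not words:
        raise ValueError("at least one fragment is required")
    return " ".join(words)


print(build_phrase(action="resting", conjunction="on",
                   adjective="dry", object="sand"))

# A user-edited list: add a term suited to the taxa under study.
FRAGMENTS["object"].add("dune crest")
print(build_phrase(action="hovering", conjunction="over", object="dune crest"))
```

Rejecting terms that are not in the lists is what keeps later searches and categorization reliable; new terms enter only by editing the lists themselves.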

TAXA: Taxonomic names, their status, and history are delineated in TAXA (formerly NAMES). A record for each taxonomic name used in Mandala, whether valid, invalid, or unregulated, has been created, including working names, those from manuscripts, or those in press. Records in TAXA are created for individual family-group and genus-group names, as well as names above the family group, e.g., Diptera and Animalia, not regulated by the International Code of Zoological Nomenclature. Records for species-group names are created as binomial (genus and species) or trinomial (genus, species, and subspecies) name records, which allows relationships among species and their usage to be clearly and unambiguously displayed in Mandala.

The most complete information is available for taxonomic groups being actively studied, so in the case of the Therevid PEET version of Mandala, therevid names are the highest priority for establishing status and history, while other taxa are entered as found on labels but have not been as thoroughly researched for their validity and history. Names of other organisms may be found from various sources on-line and links are included in TAXA, including ITIS.

Name status and history are documented by specialists by looking at primary literature and catalogs. Standards adhere to those established by Thompson 1997 (pdf). Each name establishment, change, or conflict may be documented by a corresponding link to the appropriate literature in LITERATURE. The TAXA table also records primary type information, connecting type species to a specimen in SPECIMEN and the biogeographic region and type location in which it was collected. The type layout also contains information on the range of biogeographic regions from which a species has been reported. This information is gleaned primarily from published catalogs where available and from primary literature, but can also be inferred from specimens entered in MANDALA.

PEOPLE: A separate table, PEOPLE, is maintained for names of individuals and groups treated as individuals (e.g., expeditions). These may be authors of literature, collectors of specimens, authorities of taxa, determiners of specimens, borrowers or lenders of specimens, illustrators, or holders of copyrights on illustrations. Keyed to a unique number for each name entered, the record should contain a last (family) name, first name and middle initial or name, and initials of first and middle names. Contact information may be entered as well as birth and death years or range of years that a person was active in the field. To help further distinguish similar names or locate specialists, a field has been added so that the person’s specialty (e.g., Diptera, aquatic invertebrates, higher plants, etc.) can be recorded. A field for an image now also accompanies this table in Mandala 8.

Unfortunately, we do not have complete information at this time for most of the people represented in this table. Also, currently more than one record may exist for a single person if it cannot be determined that more than one name record represents the same person (i.e., the last name may be the same, but the label lacks initials or other specifications that both records are the same, so you might see a record for “Cole”, another for “F. Cole,” and a third for “F. R. Cole”). We are attempting to “clean up” these apparent duplicates as time permits, but this requires considerable research. For now, we identify senior synonyms for a name and associate junior synonyms (less complete forms) with the senior synonym, which would allow linking to the more complete record if desired. The entomologists need to catch up with the botanists in establishing authority files for entomological workers!

PEO_JOIN: To allow the greatest flexibility of display of names listed in PEOPLE, a join table (PEO_JOIN) links the unique codes of PEOPLE to those in a second table in which this person has done something (author of literature in LITERATURE; collector of specimens in COLLEVENT; authority of a taxon in TAXA), and records the order of the names. This table enables a name to be displayed as “Cole”, “F. R. Cole,” “Frank R. Cole,” “Cole, F. R.”, or “Cole, Frank R.” as well as multiple names to be listed in order.
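How a join table can drive both the display format and the ordering of names is sketched below. The field names, style labels, and the second person ("Jane Q. Doe") are hypothetical; only the five display forms for Cole are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    person_id: int
    last: str
    first: str = ""
    middle: str = ""  # middle name or initial, without a period


def format_name(p: Person, style: str) -> str:
    """Render one PEOPLE record in one of five display styles."""
    initials = " ".join(f"{n[0]}." for n in (p.first, p.middle) if n)
    middle_initial = f"{p.middle[0]}." if p.middle else ""
    given = " ".join(x for x in (p.first, middle_initial) if x)
    if style == "last":
        return p.last
    if style == "initials_last":
        return " ".join(x for x in (initials, p.last) if x)
    if style == "given_last":
        return " ".join(x for x in (given, p.last) if x)
    if style == "last_initials":
        return ", ".join(x for x in (p.last, initials) if x)
    if style == "last_given":
        return ", ".join(x for x in (p.last, given) if x)
    raise ValueError(f"unknown style: {style}")


def display_names(joins, people, table, record_id, style, sep="; "):
    """List the names joined to one record, in their recorded order."""
    rows = [j for j in joins if j["table"] == table and j["record_id"] == record_id]
    rows.sort(key=lambda j: j["order"])
    return sep.join(format_name(people[j["person_id"]], style) for j in rows)


people = {
    1: Person(1, "Cole", "Frank", "R"),
    2: Person(2, "Doe", "Jane", "Q"),  # hypothetical second author
}
# Hypothetical join rows: person, target table, target record, position.
joins = [
    {"person_id": 2, "table": "LITERATURE", "record_id": 500, "order": 2},
    {"person_id": 1, "table": "LITERATURE", "record_id": 500, "order": 1},
]
print(display_names(joins, people, "LITERATURE", 500, "last_initials"))
```

Because the name parts are stored once in PEOPLE and the order once per join row, the same data can be shown as a citation, a label, or an author list without retyping.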

ILLUS: Information about images is documented in ILLUS. For the therevid PEET project, smaller images are included in the database, but it is possible to provide a link to MorphBank or other server locations for a full-sized image. New in Mandala 8 (April 2009) this link is also now able to be viewed from within Mandala via a Web Viewer tab. The information documented in this table includes links to SPECIMEN, LITERATURE, TAXA, LOCALITY, and LITMINING. Descriptions of the method, medium, and subject of the illustration can all be tracked here, including information on where the original illustration is housed and the archival location of a digital image on hard media (disk, CD/DVD) and its URL on the web. The name of the illustrator (connected to PEOPLE), the holder of the copyright for the illustration, and the year of the illustration or its publication are also listed. Comparisons of similarly named features across taxa are provided.

LITERATURE: The LITERATURE table contains fields for title (book and article or chapter), title translation(s), primary language of the publication, type of publication, authors (currently limited to 15), editors, a reference to journal names and abbreviations in JOURNALS, bibliographic citation fields such as series, edition, volume, number, page range, number of pages in publication, plates, publisher, and city of publication, keywords, and comments fields. The dates associated with taxonomic literature are particularly important in determining names that have priority. Therefore there is a field for the imprint year as well as fields for month, day, and year for the issue date, which takes priority over the imprint year in disputes (imprint year assumes that a name was published on December 31st of that year).
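The date rule described above can be sketched as a small function. The function and parameter names are hypothetical. The text states only that the issue date takes priority and that an imprint year alone is read as December 31st; treating a missing issue month or day as the end of the known period is an added assumption, made for consistency with that convention.

```python
import calendar
from datetime import date
from typing import Optional


def priority_date(imprint_year: Optional[int],
                  issue_year: Optional[int] = None,
                  issue_month: Optional[int] = None,
                  issue_day: Optional[int] = None) -> Optional[date]:
    """Return the date used when comparing names for priority."""
    if issue_year is not None:
        # The issue date takes priority over the imprint year.
        # Assumption: missing parts default to the end of the known period.
        month = issue_month if issue_month is not None else 12
        if issue_day is not None:
            day = issue_day
        else:
            day = calendar.monthrange(issue_year, month)[1]  # last day of month
        return date(issue_year, month, day)
    if imprint_year is not None:
        # An imprint year alone is read as December 31st of that year.
        return date(imprint_year, 12, 31)
    return None


# Two hypothetical publications with the same imprint year.
dated = priority_date(1923, issue_year=1923, issue_month=3, issue_day=15)
undated = priority_date(1923)
print(dated < undated)
```

Under this rule, a work with a documented issue date earlier in the year takes precedence over one known only by its imprint year.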

New in Mandala 8 is a feature that allows building a bibliographic export field that can be exported to such applications as EndNote®.

JOURNALS: A reference list in JOURNALS by unique journalID saves typing out journal names more than once and gives the flexibility of listing a journal title fully or by its abbreviation. Journal abbreviations are based on “Abbreviated Titles of Biological Journals,” compiled by P. C. Williams for The Biological Council (London) 1969 (3rd edition), a subset of “The World List of Scientific Periodicals.” Abbreviations should be reexamined at the time of publication to make sure that they are still currently acceptable for the journal in which they are being published. If in doubt, use the full journal title. If only an abbreviated title exists in this table, then the full title was unable to be determined. It is intended that this table be brought up to date to the latest standard in a future version.

LITMINING: This unique join table draws together elements of the literature with specimens, collecting localities, museums or collections where specimens are housed, and illustrations, based on an organizing principle designating the type of information recorded in the literature. This allows the linkage with literature of existing specimens with unique identifiers and creation of LIT IDs for specimens documented in the literature but never actually seen by a researcher working on a group. With Mandala 8, a field now exists for related extracted text to accompany the citations. This simplifies the gathering of information for potential species pages, and when the literature citation is out of copyright or open access, the extracted text may be cited verbatim with the appropriate attribution.

Identification of specimens from the literature may be changed as the corresponding specimens are located and examined by a specialist, but the linkages remain with the literature. In this manner, once the literature is thoroughly mined for these links, users will be able to ascertain the types of information contained in a piece of literature (are there original descriptions? taxonomic revisions? listing of specimens? illustrations? behaviors to record? etc.). This table is unlikely to be of top priority to many users concerned primarily with specimen data input or distribution of subsamples to specialists. It takes some skill to dissect the literature for this table, but scripted buttons exist to aid in this process in TAXA, SPECIMEN, ILLUS, LOCALITY, and BIOECOBEH. Others can and likely will be added as the need for additional organizing principles is realized.

Reference & Utility Tables in Mandala

POLREGNS: From information listed in LOCALITY, a reference table, POLREGNS, automatically categorizes the biogeographical region as one of seven: Palearctic, Oriental, Nearctic, Neotropical, Afrotropical, Australasian, and Antarctic. If changes are needed or further definitions desired (countries added, names changed) the changes made in POLREGNS will then be reflected in all records related to this table.
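The reference-table behavior described here, where a change made once in POLREGNS shows up in every related record, can be sketched as a lookup done at read time rather than a value copied into each locality. The dictionary layout, function names, and locality record are hypothetical, and real assignments are more involved, since some countries span more than one region.

```python
# The seven biogeographical regions named in the text.
REGIONS = ("Palearctic", "Oriental", "Nearctic", "Neotropical",
           "Afrotropical", "Australasian", "Antarctic")

# Hypothetical reference table keyed by country name.
polregns = {
    "USA": "Nearctic",
    "Brazil": "Neotropical",
    "Australia": "Australasian",
}


def set_region(country: str, region: str) -> None:
    """Add or change a country's region, allowing only the seven regions."""
    if region not in REGIONS:
        raise ValueError(f"{region!r} is not a recognized region")
    polregns[country] = region


def region_for(locality: dict):
    """Look up the region when a record is read; None if the country is unlisted."""
    return polregns.get(locality["country"])


locality = {"locality_id": 1, "country": "USA",
            "state": "Illinois", "county": "Champaign County"}
print(region_for(locality))

# Adding a country once makes it available to every locality that uses it.
set_region("Kenya", "Afrotropical")
print(region_for({"locality_id": 2, "country": "Kenya"}))
```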

NAVIGATION: This entry point is where users sign in and it serves as the primary navigational hub for Mandala for data entry and viewing, constructing reports, troubleshooting and finding help, including a glossary of icons and database terms. It can be accessed from any layout via the icon above. The table also includes scripts that are of use to the database administrator for quickly making Mandala files single or multiuser and can be customized for various user interfaces.

ENTRYQ: Mandala enables an electronic tracking of questions or problems and their resolution via ENTRYQ. Links back to the table concerned are provided upon creation of a record. The question is entered in one field and its resolution in another. When a question has been resolved and the situation corrected, this can be indicated so that all unresolved questions may be found and addressed. This table enables the answers to questions to be tracked over time so that the resolution may be verified even for records that no longer exist. For example, taxonomic name records that are deleted (they are considered duplicates or not in use) are first checked to see what other records in other tables may have used this taxon number and these records are changed accordingly. However before deleting this name record, an entry is created in ENTRYQ to document what the record represented and why it was deleted in case something was missed. It is very difficult to track down “naked” key codes (codes without corresponding data).

CHANGES: In May 1999, the CHANGES table was added to Mandala to track structural changes made and needed, and to report bugs and their extermination. The table consists primarily of two text fields for changes made and needed, a checklist of tables that are affected by the changes, a checklist of types of changes being made (to facilitate searches for scripting, layout, or field definition changes, for example), and a checklist of versions (server, demo, web, working) of Mandala that have been changed. It also includes a script that will generate an email announcing changes recorded in this table. This table will allow people upgrading to newer versions of Mandala to see changes made and to document their own changes relevant to their organisms of study.

HELP: Context sensitive field and table help are part of Mandala’s native HELP table; if you are in a table with no field selected, clicking on the icon with a big question mark gives a list of help topics specific for the table you were in; with the cursor in a field, clicking on that icon tells you what is expected in the field and often the format desired.