set_collection#
Note
Collection information is not required by the ALA, but it is nice to have.
One of the functions you can use to check your data is set_collection().
This function checks whether your data contain the following fields:
datasetID: An identifier for the set of data. May be a global unique identifier or an identifier specific to a collection or institution.
datasetName: The name identifying the data set from which the record was derived.
catalogNumber: A unique identifier for the record within the data set or collection.
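As a rough illustration of what such a check involves, the sketch below uses plain pandas to report which of the three fields are absent from a DataFrame. The helper name missing_collection_fields is hypothetical and is not part of galaxias; it only tests whether the columns exist, whereas set_collection() may validate more than that.

```python
import pandas as pd

# The three collection fields described above.
COLLECTION_FIELDS = ["datasetID", "datasetName", "catalogNumber"]


def missing_collection_fields(df: pd.DataFrame) -> list:
    """Hypothetical helper (not part of galaxias): list absent collection columns."""
    return [field for field in COLLECTION_FIELDS if field not in df.columns]


# Example data with only one of the three collection fields present.
occ = pd.DataFrame({
    "scientificName": ["Eolophus roseicapilla", "Eolophus roseicapilla"],
    "catalogNumber": ["2008.1334", "2008.1335"],  # illustrative values
})

missing = missing_collection_fields(occ)
print(missing)
```

Here the DataFrame carries catalogNumber only, so the helper reports the other two fields as missing.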
Specifying Collection Information#
If your observation is part of a collection, you can add information about the specimen so that
others can reference it properly. Each of the arguments above takes either a str
(denoting a column name or, as in the example below, a single value applied to every record)
or a list. An example of this is below:
>>> import galaxias
>>> import pandas as pd
>>> occ = pd.DataFrame({'scientificName': ['Eolophus roseicapilla', 'Eolophus roseicapilla']})
>>> my_dwca = galaxias.dwca(occurrences=occ)
>>> my_dwca.set_collection(dataframe=occ, datasetID='b15d4952-7d20-46f1-8a3e-556a512b04c5',
...                        datasetName='Lacey Ctenomys Recaptures', catalogNumber='2008.1334')
>>> my_dwca.occurrences.head()
          scientificName                             datasetID                datasetName catalogNumber
0  Eolophus roseicapilla  b15d4952-7d20-46f1-8a3e-556a512b04c5  Lacey Ctenomys Recaptures     2008.1334
1  Eolophus roseicapilla  b15d4952-7d20-46f1-8a3e-556a512b04c5  Lacey Ctenomys Recaptures     2008.1334
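The difference between passing a single str value and passing a list can be sketched with plain pandas assignment. This is only an analogy for the behaviour shown above, not the galaxias implementation, and the second catalogue number (2008.1335) is made up for illustration. A scalar is repeated for every record, while a list supplies one value per record and must match the number of rows.

```python
import pandas as pd

occ = pd.DataFrame({"scientificName": ["Eolophus roseicapilla", "Eolophus roseicapilla"]})

# A single string is broadcast to every row.
occ["datasetName"] = "Lacey Ctenomys Recaptures"

# A list gives one value per row; its length must equal len(occ).
catalog_numbers = ["2008.1334", "2008.1335"]  # second value is illustrative only
occ["catalogNumber"] = catalog_numbers

print(occ[["datasetName", "catalogNumber"]])
```

In plain pandas, a list whose length differs from the number of rows raises a ValueError, which is a useful early signal that the per-record values and the records are out of step.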
Other functions#
To learn more about how to use other functions, go to:
Optional functions
Creating Unique IDs
Passing Dataset