set_datetime#
One of the functions you can use to check certain columns of your data is set_datetime().
This function aims to check that you have the following Darwin Core Vocabulary Terms:
eventDate: the date of your observation
It can also (optionally) check the following:
eventTime: the time of your observation
year: the year of your observation
month: the month of your observation
day: the day of your observation
>>> import corella
>>> import pandas as pd
>>> events = pd.read_csv('<YOUR-FILENAME>.csv')
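If your data only has a full date, the optional year, month and day values can be derived from it. The following is a minimal sketch using plain pandas, not corella; the dataframe and its values are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical data: a single datetime column named eventDate
df = pd.DataFrame({"eventDate": pd.to_datetime(["2023-01-03", "2023-02-14"])})

# The .dt accessor exposes the components of each datetime value
df["year"] = df["eventDate"].dt.year
df["month"] = df["eventDate"].dt.month
df["day"] = df["eventDate"].dt.day

print(df)
```

The column names year, month and day are chosen here to match the Darwin Core terms listed above.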
eventDate with Events#
As eventDate occurs in both your occurrences data and your events data, you can use the function set_datetime() for both. For events, however, you have to set the argument check_events to True. In addition, since we know that we have dates, we can specify the eventDate column to be 'date'.
>>> my_dwca.set_datetime(check_events=True,
... eventDate='date')
Traceback (most recent call last):
File "/Users/buy003/Documents/GitHub/galaxias-python/docs/source/galaxias_user_guide/longitudinal_studies/events_workflow.py", line 76, in <module>
my_dwca.set_datetime(check_events=True,eventDate='date')
File "/Users/buy003/anaconda3/envs/galaxias-dev/lib/python3.11/site-packages/galaxias/dwca_build.py", line 559, in set_datetime
self.events = corella.set_datetime(dataframe=self.events,eventDate=eventDate,year=year,month=month,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/buy003/anaconda3/envs/galaxias-dev/lib/python3.11/site-packages/corella/set_datetime.py", line 101, in set_datetime
raise ValueError("There are some errors in your data. They are as follows:\n\n{}".format('\n'.join(errors)))
ValueError: There are some errors in your data. They are as follows:
the eventDate column must be in datetime format.
We get an error here because set_datetime() requires the eventDate column to be in a datetime format. This is to make sure the date is formatted correctly. Luckily, set_datetime() has a few arguments that will convert dates stored as strings to datetime format.
string_to_datetime: when this is set to True, will convert any strings in the eventDate column to datetime objects.
yearfirst: when this is set to True, corella (and pandas) assumes your date starts with the year.
dayfirst: when this is set to True, corella (and pandas) assumes your date starts with the day.
Note: when both yearfirst and dayfirst are set to False, pandas assumes the month is first.
>>> my_dwca.set_datetime(dataframe=events,
... eventDate='date',
... string_to_datetime=True,
... yearfirst=False,
... dayfirst=True)
        type    location   eventDate                                             name
0  siteVisit  Cannonvale  2023-01-03  bird survey local park honeyeater lookout point
1  siteVisit  Cannonvale  2023-01-17  bird survey local park honeyeater lookout point
2  siteVisit  Cannonvale  2023-01-31  bird survey local park honeyeater lookout point
3  siteVisit  Cannonvale  2023-02-14  bird survey local park honeyeater lookout point
4  siteVisit  Cannonvale  2023-02-28  bird survey local park honeyeater lookout point
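To see why the choice of dayfirst matters, the conversion can be tried in pandas on its own. This is a sketch under assumptions: the sample strings and the column name 'date' are hypothetical, and the code calls pandas.to_datetime directly rather than reproducing what corella does internally.

```python
import pandas as pd

# Hypothetical sample data: day-first date strings in a column named 'date'
events = pd.DataFrame({"date": ["03/01/2023", "17/01/2023", "31/01/2023"]})

# A column of strings is not a datetime column
is_datetime_before = pd.api.types.is_datetime64_any_dtype(events["date"])

# dayfirst=True reads "03/01/2023" as 3 January 2023
events["eventDate"] = pd.to_datetime(events["date"], dayfirst=True)
is_datetime_after = pd.api.types.is_datetime64_any_dtype(events["eventDate"])

# With the pandas default (month first), the same string is read as 1 March 2023
month_first = pd.to_datetime("03/01/2023")
day_first = pd.to_datetime("03/01/2023", dayfirst=True)

print(is_datetime_before, is_datetime_after)
print(month_first.date(), day_first.date())
```

An ambiguous string such as "03/01/2023" parses without error either way, so it is worth checking a few converted values against the original data.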
What do check_data and suggest_workflow say now?#
Note: each of the set_* functions checks your data for compliance with the Darwin Core standard, but it’s always good to double-check your data.
Now, we can check that our data columns comply with the Darwin Core standard.
>>> my_dwca.check_data(events=events)
  Number of Errors  Pass/Fail    Column name
------------------  -----------  -------------
                 1  ✗            eventDate
══ Results ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Errors: 1 | Passes: 0
✗ Data does not meet minimum Darwin core requirements
Use corella.suggest_workflow()
── Error in eventDate ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
eventDate is a required field. Please ensure it is in your dataframe
eventDate is a required field. Please ensure it is in your dataframe
However, since we don’t have all of the required columns, we can run suggest_workflow() again to see how our data is doing this time round.
>>> my_dwca.suggest_workflow(dataframe=events)
── Darwin Core terms ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
── All DwC terms ──
Matched 1 of 9 column names to DwC terms:
✓ Matched: eventDate
✗ Unmatched: Longitude, Species, Collection_date, number_birds, Latitude, name, location, type
── Minimum required DwC terms occurrences ──
Type                       Matched term(s)    Missing term(s)
-------------------------  -----------------  ------------------------------------------------
Identifier (at least one)  -                  occurrenceID OR catalogNumber OR recordNumber
Record type                -                  basisOfRecord
Scientific name            -                  scientificName
Location                   -                  decimalLatitude, decimalLongitude, geodeticDatum
Date/Time                  -                  eventDate
Associated event ID        -                  eventID
── Minimum required DwC terms events ──
Type                   Matched term(s)    Missing term(s)
---------------------  -----------------  -----------------
Identifier             -                  eventID
Linking identifier     -                  parentEventID
Type of Event          -                  eventType
Name of Event          -                  Event
How data was acquired  -                  samplingProtocol
Date of Event          eventDate          -
── Suggested workflow ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
── Occurrences ──
To make your occurrences Darwin Core compliant, use the following workflow:
corella.set_occurrences()
corella.set_scientific_name()
corella.set_coordinates()
corella.set_datetime()
Additional functions: set_abundance(), set_collection(), set_individual_traits(), set_license(), set_locality(), set_taxonomy()
── Events ──
To make your events Darwin Core compliant, use the following workflow:
corella.set_events()
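The matched and missing event terms in the report above amount to comparing column names against a list of required terms. The sketch below reproduces that comparison in plain Python; it is not corella's implementation, and the two lists are copied by hand from the output shown above.

```python
# Column names as reported in the "All DwC terms" section above
columns = [
    "Longitude", "Species", "Collection_date", "number_birds",
    "Latitude", "name", "location", "type", "eventDate",
]

# Required event terms, copied from the events table above
required_event_terms = [
    "eventID", "parentEventID", "eventType", "Event",
    "samplingProtocol", "eventDate",
]

# Split the required terms by whether a column with that exact name exists
matched = [term for term in required_event_terms if term in columns]
missing = [term for term in required_event_terms if term not in columns]

print("Matched:", matched)
print("Missing:", missing)
```

The comparison is by exact name, which is why a column such as Collection_date does not count towards eventDate.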
Other functions#
To learn more about how to use these functions, go to
Optional functions:
Passing Dataset: