set_datetime

set_datetime() is one of the functions you can use to check specific columns of your data. It checks that your data contains the following Darwin Core term:

  • eventDate: the date of your observation

It can also (optionally) check the following:

  • eventTime: the time of your observation

  • year: the year of your observation

  • month: the month of your observation

  • day: the day of your observation

>>> import corella
>>> import pandas as pd
>>> events = pd.read_csv('<YOUR-FILENAME>.csv')
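
To make the optional terms concrete, the sketch below derives year, month and day from an eventDate column using plain pandas. The dataframe and its values are hypothetical, and this is not necessarily how corella handles these columns; it only illustrates what each term is expected to hold.

```python
import pandas as pd

# Hypothetical example data; the dates are made up for illustration.
example = pd.DataFrame(
    {"eventDate": pd.to_datetime(["2023-01-03", "2023-02-14"])}
)

# Each optional term holds one component of eventDate.
example["year"] = example["eventDate"].dt.year
example["month"] = example["eventDate"].dt.month
example["day"] = example["eventDate"].dt.day

print(example)
```

The .dt accessor is only available once the column is in datetime format, which is one reason set_datetime() insists on that format for eventDate.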

eventDate with Events

Because eventDate occurs in both your occurrences data and your events data, you can use set_datetime() for both. For events, however, you have to set the argument check_events to True. In addition, since we know that our dates are in the column named 'date', we can pass eventDate='date' to tell the function which column holds the eventDate values.

>>> my_dwca.set_datetime(check_events=True,
...                      eventDate='date')
Traceback (most recent call last):
  File "/Users/buy003/Documents/GitHub/galaxias-python/docs/source/galaxias_user_guide/longitudinal_studies/events_workflow.py", line 76, in <module>
    my_dwca.set_datetime(check_events=True,eventDate='date')
  File "/Users/buy003/anaconda3/envs/galaxias-dev/lib/python3.11/site-packages/galaxias/dwca_build.py", line 559, in set_datetime
    self.events = corella.set_datetime(dataframe=self.events,eventDate=eventDate,year=year,month=month,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/buy003/anaconda3/envs/galaxias-dev/lib/python3.11/site-packages/corella/set_datetime.py", line 101, in set_datetime
    raise ValueError("There are some errors in your data.  They are as follows:\n\n{}".format('\n'.join(errors)))
ValueError: There are some errors in your data.  They are as follows:

the eventDate column must be in datetime format.

We get an error here because set_datetime() requires the eventDate column to be in datetime format. This is to make sure the date is formatted correctly. Luckily, set_datetime() has a few arguments that convert dates stored as strings to datetime format.

  • string_to_datetime: when this is set to True, corella converts any strings in the eventDate column to datetime objects.

  • yearfirst: when this is set to True, corella (and pandas) assumes your date starts with the year.

  • dayfirst: when this is set to True, corella (and pandas) assumes your date starts with the day.

Note that when both yearfirst and dayfirst are set to False, pandas assumes the month comes first.
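
The effect of dayfirst can be seen directly with pandas.to_datetime, which accepts a dayfirst argument of its own. This is a minimal sketch using made-up date strings in a day/month/year layout; whether corella passes the flag through to pandas in exactly this way is an assumption based on the descriptions above. It also shows how to confirm that a column is in datetime format, using pandas.api.types.is_datetime64_any_dtype.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

ambiguous = "03/01/2023"  # hypothetical date string

# Default (dayfirst=False, yearfirst=False): the month is read first.
month_first = pd.to_datetime(ambiguous)
# dayfirst=True: the day is read first.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first.date())  # 2023-03-01
print(day_first.date())    # 2023-01-03

# A column of date strings is not in datetime format until it is converted.
raw = pd.Series(["03/01/2023", "17/01/2023"])  # hypothetical raw values
converted = pd.to_datetime(raw, dayfirst=True)

print(is_datetime64_any_dtype(raw))        # False
print(is_datetime64_any_dtype(converted))  # True
```

Choosing the wrong flag does not always raise an error: "03/01/2023" parses successfully either way, so it is worth inspecting a few converted values by eye.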

>>> my_dwca.set_datetime(dataframe=events,
...                      eventDate='date',
...                      string_to_datetime=True,
...                      yearfirst=False,
...                      dayfirst=True)
        type    location  eventDate                                             name
0  siteVisit  Cannonvale 2023-01-03  bird survey local park honeyeater lookout point
1  siteVisit  Cannonvale 2023-01-17  bird survey local park honeyeater lookout point
2  siteVisit  Cannonvale 2023-01-31  bird survey local park honeyeater lookout point
3  siteVisit  Cannonvale 2023-02-14  bird survey local park honeyeater lookout point
4  siteVisit  Cannonvale 2023-02-28  bird survey local park honeyeater lookout point

What do check_data and suggest_workflow say now?

Note: each of the set_* functions checks your data for compliance with the Darwin Core standard, but it’s always good to double-check your data.

Now, we can check that our data columns comply with the Darwin Core standard.

>>> my_dwca.check_data(events=events)
  Number of Errors  Pass/Fail    Column name
------------------  -----------  -------------
                 1  ✗            eventDate


══ Results ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════


Errors: 1 | Passes: 0

✗ Data does not meet minimum Darwin core requirements
Use corella.suggest_workflow()

── Error in eventDate ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

eventDate is a required field. Please ensure it is in your dataframe
eventDate is a required field. Please ensure it is in your dataframe

However, since we don’t have all of the required columns, we can run suggest_workflow() again to see how our data is doing this time round.

>>> my_dwca.suggest_workflow(dataframe=events)
── Darwin Core terms ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── All DwC terms ──

Matched 1 of 9 column names to DwC terms:

✓ Matched: eventDate
✗ Unmatched: Longitude, Species, Collection_date, number_birds, Latitude, name, location, type

── Minimum required DwC terms occurrences ──

Type                       Matched term(s)    Missing term(s)
-------------------------  -----------------  ------------------------------------------------
Identifier (at least one)  -                  occurrenceID OR catalogNumber OR recordNumber
Record type                -                  basisOfRecord
Scientific name            -                  scientificName
Location                   -                  decimalLatitude, decimalLongitude, geodeticDatum
Date/Time                  -                  eventDate
Associated event ID        -                  eventID

── Minimum required DwC terms events ──

Type                   Matched term(s)    Missing term(s)
---------------------  -----------------  -----------------
Identifier             -                  eventID
Linking identifier     -                  parentEventID
Type of Event          -                  eventType
Name of Event          -                  Event
How data was acquired  -                  samplingProtocol
Date of Event          eventDate          -

── Suggested workflow ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Occurrences ──

To make your occurrences Darwin Core compliant, use the following workflow:

corella.set_occurrences()
corella.set_scientific_name()
corella.set_coordinates()
corella.set_datetime()

Additional functions: set_abundance(), set_collection(), set_individual_traits(), set_license(), set_locality(), set_taxonomy()

── Events ──

To make your events Darwin Core compliant, use the following workflow:

corella.set_events()
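
The matched/missing tables in this report boil down to comparing your column names against a list of required terms. The sketch below reproduces that comparison for the events table in plain Python. It is not corella's implementation: match_terms is a hypothetical helper, the required terms are copied from the report above, and the column names are taken from the events dataframe printed earlier.

```python
# Required event terms, copied from the suggest_workflow() report above.
REQUIRED_EVENT_TERMS = [
    "eventID",
    "parentEventID",
    "eventType",
    "Event",
    "samplingProtocol",
    "eventDate",
]


def match_terms(columns, terms):
    """Split terms into those present in columns and those missing."""
    present = set(columns)
    matched = [term for term in terms if term in present]
    missing = [term for term in terms if term not in present]
    return matched, missing


# Column names of the events dataframe shown earlier.
columns = ["type", "location", "eventDate", "name"]
matched, missing = match_terms(columns, REQUIRED_EVENT_TERMS)

print("Matched:", matched)
print("Missing:", missing)
```

Note that the comparison is exact and case-sensitive, so a column named 'date' or 'EventDate' would not count as a match for eventDate.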

Other functions

To learn more about how to use these functions, go to

Optional functions:

Passing Dataset: