Handling Dates In Analytics Projects

Dates are an important part of almost every real data science or AI project. Whether the data is the monthly sales of a product, the profit a company has made over recent quarters, or the daily movement of a stock price, dates matter when capturing the data, storing it, and analysing it. This article discusses how dates are treated in the R language and the functions available for running analytics on them.

Date functions in R

Dates can come in many formats and are typically read in as strings (character values). The standard date format is "YYYY-MM-DD". To convert string date values into actual dates in R, the as.Date() function is used. Internally, Date objects are stored as the number of days since January 1, 1970 (negative numbers are used for earlier dates).
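As a minimal sketch of the conversion and the internal storage described above (the variable names are only illustrative):

```r
# Convert a string in the standard "YYYY-MM-DD" format to a Date object.
d <- as.Date("2021-11-03")
class(d)                      # "Date"

# The underlying value is the number of days since 1970-01-01.
days_since_origin <- as.numeric(d)

# The origin itself is day 0, and earlier dates are negative.
origin_value <- as.numeric(as.Date("1970-01-01"))   # 0
before_value <- as.numeric(as.Date("1969-12-31"))   # -1
```

Printing d still shows a readable date; only the stored value is numeric.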

The following functions are useful when working with dates:

To get the current system date, we can use the Sys.Date() function.

> Sys.Date()
[1] "2021-11-03"

The Sys.timezone() function returns the name of the system's time zone.
The Sys.time() function returns the current system date and time as a single date-time value, which is printed together with the time zone.
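The three functions can be compared side by side. This is only a sketch; the values depend on the machine and the moment the code is run, so no output is shown.

```r
# Current date only, as a Date object.
today <- Sys.Date()

# Current date and time, as a date-time (POSIXct) object.
now <- Sys.time()

# Name of the system time zone (may be NA if R cannot determine it).
tz_name <- Sys.timezone()

class(today)   # "Date"
class(now)     # "POSIXct" "POSIXt"

# Format the date-time explicitly, including the time zone abbreviation.
now_text <- format(now, "%Y-%m-%d %H:%M:%S %Z")
```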

Storing dates in R

By default, R treats date values as character values when importing a dataset. For example, when a data file is imported into R, the values in a date column may look like "3 Nov 2021" or "2021/11/03". These values are treated as character values in R. R must be told two things: first, that these values are dates, and second, which parts of the format are the day, the month and the year.

Once this is done (for example, by calling as.Date() with a format argument), R converts the values to class Date. The date values are then stored as numbers, i.e., the number of days from the "origin" date of 1 Jan 1970. Because dates are stored as numbers, they are treated as continuous variables, which makes it possible to calculate the distance between two dates.
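The two instructions can be given through the format argument of as.Date(). The sketch below uses the two example strings from the text. The sales data frame and its column names are hypothetical and stand in for an imported dataset.

```r
# Dates as they might arrive from an imported file: plain character strings.
raw_text  <- "3 Nov 2021"
raw_slash <- "2021/11/03"
class(raw_text)     # "character"

# %d = day of month, %b = abbreviated month name, %Y = four-digit year.
# Note: %b matches month names in the current locale (English assumed here).
d_text <- as.Date(raw_text, format = "%d %b %Y")

# %m = month as a number; the separators in the format must match the data.
d_slash <- as.Date(raw_slash, format = "%Y/%m/%d")

# Hypothetical imported data with a character date column.
sales <- data.frame(
  date  = c("2021/11/01", "2021/11/03"),
  units = c(10, 12),
  stringsAsFactors = FALSE
)
sales$date <- as.Date(sales$date, format = "%Y/%m/%d")

class(sales$date)   # "Date"
diff(sales$date)    # Time difference of 2 days
```

If the format string does not match the data, as.Date() returns NA for that value, so it is worth checking for NAs after converting.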

Working with Dates in R

Let us execute the following lines of code in R and see how useful information can be extracted using various date functions.

> d <- as.Date("2021-10-14")

> d
[1] "2021-10-14"

> months(d)
[1] "October"

> weekdays(d)
[1] "Thursday"

> quarters(d)
[1] "Q4"

> e <- as.Date("2021-10-24")
> e-d
Time difference of 10 days
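Beyond the functions above, a few other base R calls are often handy for the same two dates. This is a small sketch rather than a complete list, and the variable names are only illustrative.

```r
d <- as.Date("2021-10-14")
e <- as.Date("2021-10-24")

# format() extracts individual components as text.
year_chr  <- format(d, "%Y")               # "2021"
month_num <- as.integer(format(d, "%m"))   # 10

# Adding a number to a Date moves it forward by that many days.
later <- d + 30                            # "2021-11-13"

# difftime() gives the gap between two dates in a chosen unit.
gap_weeks <- difftime(e, d, units = "weeks")

# seq() builds a regular sequence of dates, here one per month.
monthly <- seq(d, by = "month", length.out = 3)
```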

The lubridate package in R provides additional functions that make working with dates more convenient and is worth exploring.
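As a brief illustration, the sketch below uses a few lubridate functions. It assumes the package is installed (install.packages("lubridate")) and is skipped otherwise.

```r
# Run the example only if lubridate is available on this machine.
if (requireNamespace("lubridate", quietly = TRUE)) {
  # Parsing functions are named after the order of the date parts.
  d1 <- lubridate::ymd("2021/11/03")    # year, month, day
  d2 <- lubridate::dmy("03-11-2021")    # day, month, year

  # Accessor functions return individual components as numbers.
  yr <- lubridate::year(d1)     # 2021
  mo <- lubridate::month(d1)    # 11
  dy <- lubridate::day(d1)      # 3

  # Round a date down to the first day of its month.
  month_start <- lubridate::floor_date(d1, "month")   # "2021-11-01"
}
```

Here the name of the parsing function (ymd, dmy and so on) takes the place of the format string that as.Date() would need.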

Handling Dates in Python

In Python, the datetime module can be imported to work with date and time values as objects. The module provides several relevant classes. For example, the time class represents time values, while the date class represents calendar date values.
