Working with the Date and Time Component
In data analysis, the task of breaking down a date and time into its components—year, month, day, hour, minutes, day of the week, and others—regularly arises. This is because, for finding patterns, it is not the date itself that is important, but specific information about it.
For example, to optimize the operation of an online store, it may be necessary to identify which day of the week has peak sales. Obviously, for this, you need to understand which day of the week the analyzed dates correspond to.
It's not difficult to determine that September 21, 2023, is a Thursday, but is a user able to do this without a calendar? And how much time would it take to analyze a million such records?
It is difficult for people to determine the day of the week or the week number without a calendar, as there is no corresponding rule. For example, January 1st could be a Saturday, and February 1st could be a Wednesday.
Likewise, determining the week number from a date requires effort, as week numbering systems differ depending on the chosen standard. For example, 01.01.2023 in the Gregorian calendar belongs to the 1st week of 2023, but in the ISO 8601 standard, it belongs to the 52nd week of the previous year.
Can a user quickly determine that October 4, 2023, is the 277th day of the year? Or, conversely, say that the 222nd day of 2023 corresponds to August 10th?
Date/time data, from which its parts have been extracted, allows for more useful results. For example, it can be aggregated to analyze monthly sales, quarterly profits, or to see how data changes in different periods.
Storing date and time in IT systems
Users are accustomed to writing dates as 22-03-15 or 22.03.2015, but within information systems, they are usually stored as numbers. These are numerical values that represent the number of time units that have passed since a fixed point in time.
This makes the data easier to process. In addition, such a representation takes up less space on the disk and in RAM. It provides a simple and uniform representation of date and time.
For example, UNIX time or POSIX time is often used for storage. This method of encoding time is the number of seconds that have passed since the beginning of the Unix epoch—January 1, 1970, 00:00:00 UTC.
Using UNIX time simplifies date comparison and the calculation of time intervals through simple arithmetic operations. For example, 2023-09-26 09:00:00 when converted to Unix time will correspond to the number 1695708000.
At the same time, for ease of perception by people, dates are converted back into the usual text representation when displayed on the screen.
The ISO 8601 standard
The representation of date and time in the Megaladata platform is divided into a standard format and the ISO 8601 standard. The first does not cause difficulties, but the second option requires explanation.
ISO 8601 is an international standard that describes the date/time format and provides recommendations for its use. A date in this standard consists of three parts: year, week number (1–53), and day of the week (1–7). A special feature is that the first week of the year is considered to be the one that contains the first Thursday.
The week begins on Monday and ends on Sunday. Thus, the ISO year can begin 1–3 days earlier or later than the Gregorian calendar year.
This standard is mainly used in government institutions and the business sector. It ensures the unification of time representation, which is important for conducting operations, especially on the global market.
Dates according to ISO 8601
In this example, January 1st belongs to the previous year when using the ISO 8601 standard. Since January 1, 2017, fell on a Sunday, the first week of 2017 began on Monday, that is, January 2nd.
The Date and Time component in Megaladata
Megaladata has a special Date and Time component that provides convenient mechanisms for working with fields of this type.
It allows for the extraction of the following derivatives from a field:
- year and quarter
- year and month
- year and week
- year and day
- year, quarter, month, and week
- day of the year, quarter, month, or week
- hours, minutes, seconds, milliseconds
- date
- time
In addition, the date/time can be converted to a custom format. It is set using a combination of arbitrary text and predefined markers
Representing dates using predefined markers in Megaladata
For example, the date 03.22.2023 can be converted to the form 2023-03-22.
Let's explore the functionality of the Date and Time component.
Working with the Date and Time component
Below are several practical cases that will require the use of this component. The described actions are only permissible for fields with the Date/Time data type.
Case 1
There is a database of three pharmacies for several years, which includes the fields: date of sale, product name, cost, quantity. The task is to check on which day of the week the revenue is minimal.
Possible solution
You can find out which day of the week corresponds to each date and calculate the profit. Then, group the data by days of the week.
Working in Megaladata
Node configuration: splitting the date into year and day of the week
The node configuration gives us the following result:
Splitting the date into days of the week
A chart built on the prepared data makes it easy to get the answer to the task question.
Graphical representation of data for one year
Case 2
Find out which week in the last few years was the most profitable.
Possible solution
You can match the date with the week number, perform grouping, and visualize the data.
Working in Megaladata
If you group by date without pre-processing the fields, the output will be sales by day, not by week. This step doesn't achieve the task goal.
Visualization of the initial data
In the Date and Time component, when configuring the node, you need to break the date down into year and week, finding their first and last days.
Node configuration: Splitting the date into year and week
It is interesting that for the Date 01.01.2017, the first and last day of the week coincide, as January 1st fell on a Sunday.
Splitting the date into year and week—first and last day
Therefore, converting the date in line with ISO 8601 gives a different result.
Dates according to the ISO 8601 standard
You can compare the data of different weeks visually using a chart.
Data visualization
Case 3
Analyzing sales volume and revenue by month, quarter, and year. The data includes the fields: date of sale, number of customers, sales volume, and revenue. The database contains information for several years.
Initial data
Possible solution
Data analysis should be focused on the available information and context. For example, if there is data for only one year, then analyzing it by year does not make sense, as there are no other points of comparison.
Each date must be matched with the selected interval, grouped, and visualized.
Working in Megaladata
Option 1
When configuring the node, you need to split the date into year and month, finding the first and last day of the month.
Node configuration: Splitting the date into year and month—first and last day
Next, grouping the data by month. The data visualization will look like this:
Data visualization by month
Option 2
Add the year and the first and last day of the quarter to the source data.
The result is grouped by quarter and visualized:
Data visualization by quarter
Option 3
When configuring the node, the first and last day of the year are extracted from the date:
Node configuration: Splitting the date into year—first and last day
Then, as in the previous examples, grouping and visualization take place:
Data visualization by year
Determining the right time intervals for analysis always depends on the tasks that need to be solved. To identify trends and changes, you need to have at least a few time periods for comparison. At the same time, it is sometimes useful to combine the analysis of data covering different periods to get a more complete picture.
The presented cases are not a benchmark, but rather demonstrate example ways of solving common problems.
In conclusion
The Date and Time component of the Megaladata platform provides the ability to flexibly work with Date/Time data without the need to write code in a programming language:
- break down date/time into its constituent parts
- add the beginning and end of a period
- determine the day of the week, month, quarter, or year
- carry out all operations in accordance with the ISO 8601 standard
- convert a field to text with a specified format.
Breaking down a date into different time intervals makes it easier to find patterns and visualize data. Fields processed in this way are also easier to compare.
Read more:
See also