Introduction to Reference Data in the Vortexa Python SDK

When working with Vortexa’s Python SDK, one of the foundational components you'll encounter is reference data. But what exactly is reference data, and why is it crucial for your analytics and data manipulation tasks?

What is Reference Data?

Reference data, also known as master data, encompasses the core datasets that define and categorize the entities within Vortexa’s ecosystem. These datasets are used to standardize and provide context to the event data you’ll be working with. In essence, reference data acts as the backbone for ensuring consistency, accuracy, and reliability in your data operations.

The Three Main Types of Reference Data in Vortexa

1. Geographies: Detailed information about various locations, such as ports, terminals, and regions. This allows you to accurately track and analyze movements and activities related to these locations.

2. Vessels: Comprehensive data about vessels, including their names, types, sizes, and other attributes. This is essential for tracking vessel movements, understanding fleet compositions, and analyzing shipping trends.

3. Products: Information about different types of products, their classifications, and hierarchies. This helps in analyzing trade flows, market dynamics, and commodity-specific trends.

Import Libraries

import vortexasdk as v
import pandas as pd

To start, let's explore what you can extract from the Geographies endpoint.

# To load all geographies
all_geographies = v.Geographies().load_all().to_df()

# To load all geographies with a specific type (e.g. country)
country_list = v.Geographies().search(filter_layer=['country']).to_df()

# To load all geographies with a name containing 'China'
china_list = v.Geographies().search(term='China').to_df()
all_geographies.head(5)
   id                                                 name                             layer
0  0eb7b43e3d4e62db74e187bc4eadadfb878a210201e14a...  21st Century Shipbuilding        [terminal]
1  b27430b57d617a855d44f91fb70441bae69c19a3c0deb4...  2x1 Holding Cape Midia Shipyard  [terminal]
2  d38a8f7bf8ed422b439ad5270be65b60b964bed9568936...  A Pobra Do Caraminal [ES]        [port]
3  3ebcf6f2e43e3a8b06e9b0ae31df3d87f3fbc8d032b72d...  A and P Group Falmouth Shipyard  [terminal]
4  0c8a30f40639257e5352e2c6ac52af2f93b2c6bfed7187...  A and P Group Tees Shipyard      [terminal]
country_list.head(5)
   id                                                 name            layer
0  1cd3c07221f9e9b3296c859d0bcd3da17ac6072bfdcc84...  Afghanistan     [country]
1  5e4e7b5040b933b5a0f0d2357ed27ebec432c749b9d63a...  Albania         [country]
2  87269b28eaea324d2c35e97b0ecc837ebc9a244faf94e2...  Algeria         [country]
3  f4435d4fffa5b2ba7a340e3a8e7d421f619d9f7832fa0c...  American Samoa  [country]
4  db3cb74043b8fd438087a3e0e04e3e498b78c3c9790fce...  Andorra         [country]
china_list.head(5)
   id                                                 name                   layer
0  934c47f36c16a58d68ef5e007e62a23f5f036ee3f3d1f5...  China                  [country]
1  781cacc7033f877caa4b4106d096b74afe006a96391bf5...  South China            [alternative_region]
2  a63890260e29d859390fd1a23c690181afd4bd152943a0...  North China            [alternative_region]
3  d1d5a3d3666d6ebb65a1b53e626b07f6b8540e8048a524...  Shipyard Nansha China  [terminal]
4  ae0f224030f7337d0ffe5a54d290a9f0bd029f636eaf12...  China Yangfan Group    [terminal]
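
A term search returns every geography whose name contains the term, across all layers, so a second filter is usually needed to isolate a single entity. The following is a minimal sketch of that filter, not part of the SDK. It works on plain row dictionaries, the shape that pandas produces with `china_list.to_dict("records")`. The `find_ids` helper and the placeholder ids are hypothetical and exist only for illustration.

```python
# Hypothetical rows shaped like the search output above (id, name, layer).
# The ids are placeholders, not real Vortexa ids.
rows = [
    {"id": "id-china", "name": "China", "layer": ["country"]},
    {"id": "id-south-china", "name": "South China", "layer": ["alternative_region"]},
    {"id": "id-nansha", "name": "Shipyard Nansha China", "layer": ["terminal"]},
]


def find_ids(rows, name, layer):
    """Return ids of rows whose name matches exactly and whose layer list contains `layer`."""
    return [
        row["id"]
        for row in rows
        if row["name"] == name and layer in row["layer"]
    ]


china_country_ids = find_ids(rows, name="China", layer="country")
print(china_country_ids)  # ['id-china']
```

The membership test (`layer in row["layer"]`) is used because each layer cell holds a list, so it may contain more than one value.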

This shows how to extract the IDs, which other endpoints may require as inputs. To retrieve more information about each location, such as centroids and hierarchies, pass columns='all' to the .to_df() method.

china_list_enhanced = v.Geographies().search(term='China').to_df(columns = 'all')
china_list_enhanced.head(5)
id name layer leaf parent exclusion_rule ref_type hierarchy pos aliases tags
0 934c47f36c16a58d68ef5e007e62a23f5f036ee3f3d1f5... China [country] False [{'name': 'Far East', 'layer': ['alternative_r... [{'name': 'China', 'layer': ['country'], 'id':... geography [{'label': 'China', 'layer': 'country', 'id': ... [105.4525116754, 35.4496039032] [] {'importProductTags': [], 'exportProductTags':...
1 781cacc7033f877caa4b4106d096b74afe006a96391bf5... South China [alternative_region] False [{'name': 'China (excl. HK & Macau)', 'layer':... [{'name': 'South China', 'layer': ['alternativ... geography [{'id': 'b5fafce6e20de2dc307fb7e0b89978ee91a49... [106.3418341843, 28.1283930445] [] {'importProductTags': [], 'exportProductTags':...
2 a63890260e29d859390fd1a23c690181afd4bd152943a0... North China [alternative_region] False [{'name': 'China (excl. HK & Macau)', 'layer':... [{'name': 'North China', 'layer': ['alternativ... geography [{'id': 'b5fafce6e20de2dc307fb7e0b89978ee91a49... [104.7737174548, 40.9901656588] [] {'importProductTags': [], 'exportProductTags':...
3 d1d5a3d3666d6ebb65a1b53e626b07f6b8540e8048a524... Shipyard Nansha China [terminal] True [{'name': 'Nansha [CN]', 'layer': ['port', 'st... [{'name': 'Shipyard Nansha China', 'layer': ['... geography [{'label': 'Shipyard Nansha China', 'layer': '... [113.5214914215, 22.7496639804] [] {'importProductTags': [], 'exportProductTags':...
4 ae0f224030f7337d0ffe5a54d290a9f0bd029f636eaf12... China Yangfan Group [terminal] True [{'name': 'Zhoushan [CN]', 'layer': ['port', '... [{'name': 'China Yangfan Group', 'layer': ['te... geography [{'label': 'China Yangfan Group', 'layer': 'te... [122.2921859199, 29.9282176371] [] {'importProductTags': [], 'exportProductTags':...
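
The extra columns hold nested values: `pos` is a two-element list and `parent` is a list of dictionaries with at least `name` and `layer` keys. The sketch below flattens one such row into simple fields. It rests on two assumptions that the output above suggests but does not confirm: that `pos` is ordered [longitude, latitude], and that a country-level entry can appear in `parent`. The record and the `summarize` helper are hypothetical.

```python
# Hypothetical record shaped like one row of the columns='all' output.
# Names and coordinates are placeholders, not real Vortexa data.
record = {
    "name": "Example Terminal",
    "layer": ["terminal"],
    "pos": [10.5, 50.25],
    "parent": [
        {"name": "Example Port", "layer": ["port"]},
        {"name": "Example Country", "layer": ["country"]},
    ],
}


def summarize(record):
    """Flatten the nested pos and parent fields into a simple dict."""
    # Assumption: pos is ordered [longitude, latitude].
    lon, lat = record["pos"]
    # Take the first parent tagged with the country layer, or None if absent.
    country = next(
        (p["name"] for p in record["parent"] if "country" in p["layer"]),
        None,
    )
    return {"name": record["name"], "lon": lon, "lat": lat, "country": country}


summary = summarize(record)
print(summary)
```

Returning `None` when no country-level parent exists keeps the helper usable for rows such as regions, whose parents may sit above the country level.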

We have now seen how to extract reference data from the Geographies endpoint. The same approach works for the Products and Vessels endpoints.

# To load all products
products = v.Products().load_all().to_df()
products.head(5)
   id                                                 name                      layer.0  parent.0.name
0  887940a6cf2d527a20d82a5f163ecce502878ceb1cd59f...  0.005 / 50ppm             grade    Gasoil
1  8bb096fb847f92af86235002b2a78ca0437543722cdb8c...  0.05 / 500ppm             grade    Gasoil
2  35f8222ff81fe5befafee9c64c1d76618e4cc53e74021a...  0.1 / 1000ppm             grade    Gasoil
3  881be476857ff08dcf6a8708a2fc279d26770cecf245f7...  0.1+ / 1000ppm Plus (HS)  grade    Gasoil
4  a6ef13d2f3145a1b67a81300c1cfa4f21874f24fab4f8f...  180 CST                   grade    High Sulphur Fuel Oil
# To load Gasoil only
gasoil = v.Products().search(term='Gasoil').to_df()
gasoil.head(5)
   id                                                 name           layer.0        parent.0.name
0  b2034f1ad3a4ac269e962f00b9914d6b909923cf904d99...  Gasoil         category       Diesel/Gasoil
1  deda35eb9ca56b54e74f0ff370423f9a8c61cf6a3796fc...  Diesel/Gasoil  group_product  Clean Petroleum Products
2  e06296595e1d554008a70172440d5582c923bdb8182af5...  Coker Gasoil   grade          Dirty Feedstocks
3  feb8190865392ab6caecd7709077a58645ec4828c23d94...  Marine Gasoil  grade          Gasoil
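
The Gasoil search returns products from several levels of the hierarchy (group_product, category, and grade), and not every match sits under Gasoil: Coker Gasoil belongs to Dirty Feedstocks. A short sketch of how the `layer.0` and `parent.0.name` columns can be used to separate these cases follows. The rows copy the names shown above with the ids omitted, and the grouping logic is an illustration rather than an SDK feature.

```python
from collections import defaultdict

# Rows copied from the search output above, ids omitted for brevity.
rows = [
    {"name": "Gasoil", "layer.0": "category", "parent.0.name": "Diesel/Gasoil"},
    {"name": "Diesel/Gasoil", "layer.0": "group_product", "parent.0.name": "Clean Petroleum Products"},
    {"name": "Coker Gasoil", "layer.0": "grade", "parent.0.name": "Dirty Feedstocks"},
    {"name": "Marine Gasoil", "layer.0": "grade", "parent.0.name": "Gasoil"},
]

# Map each parent name to the products listed directly beneath it.
children = defaultdict(list)
for row in rows:
    children[row["parent.0.name"]].append(row["name"])

# Keep only the most granular layer.
grades = [row["name"] for row in rows if row["layer.0"] == "grade"]

print(dict(children))
print(grades)  # ['Coker Gasoil', 'Marine Gasoil']
```
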
# To load all vessels
vessels = v.Vessels().load_all().to_df()
vessels.head(5)
   id                                                 name       imo        vessel_class
0  62f3f3c1f5a663d621fe6cf9537c7d936b547497932f5d...  \tATHINEA  9291248.0  oil_lr2
1  e6b259c04da30a57db353665e7e61f67a0a3222b96c457...  \tBORA     9276004.0  oil_mr2
2  f351708121bce4d357ac5fad967cb1bf7fe5072773f05a...  0051-04               oil_coastal
3  1761da4fb069cd6ce153b6ad1c48e15cdb994eb386e4aa...  058                   oil_coastal
4  c817b5994efe14621949533d6777b22ce11db1c6bf9e48...  1011                  oil_coastal
# To load all vessels with a name containing 'Maersk' (currently named or previously named)
maersk_vessels = v.Vessels().search(term='Maersk').to_df()
maersk_vessels.head(5)
   id                                                 name              imo      vessel_class
0  00d89be99f08890c9122c326aa32c83ae6d557629bc124...  VS REMLIN         9252307  oil_mr1
1  03c191caf28a554c8c1adfc11ff2a7a08ad5fa21a83892...  SOUTH LOYALTY     9537769  oil_vlcc
2  07e562b509f2617c18f9e848c51bdec8e97488c4a41bbc...  HENRIETTE MAERSK  9399349  oil_mr1
3  085ab83b592713e314070620a8cb9f4d795a2b486075f4...  VS                9252292  oil_mr1
4  09332aaa6067574eac586030b5b81cf5554337c4f48d17...  BULL SULAWESI     9180920  oil_lr2
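
Two details in the vessel outputs are worth handling before analysis. Some names carry stray whitespace and the `imo` column comes back as a float when values are missing, and a name search also matches previous names, so most rows above no longer contain "Maersk". The sketch below tidies the rows and keeps only vessels whose current name contains the term. The rows are hypothetical samples shaped like the outputs above (assuming a missing IMO arrives as NaN or None), and `clean_imo` and `clean_vessel` are illustrative helpers, not SDK functions.

```python
import math

# Hypothetical rows shaped like the vessel outputs above.
rows = [
    {"name": "\tATHINEA", "imo": 9291248.0},
    {"name": "0051-04", "imo": float("nan")},
    {"name": "HENRIETTE MAERSK", "imo": 9399349},
    {"name": "VS REMLIN", "imo": 9252307},
]


def clean_imo(value):
    """Convert an IMO number to int, mapping None or NaN to None."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return int(value)


def clean_vessel(row):
    """Strip surrounding whitespace from the name and normalise the IMO."""
    return {"name": row["name"].strip(), "imo": clean_imo(row["imo"])}


cleaned = [clean_vessel(row) for row in rows]

# Keep only vessels whose current name contains the search term.
current_maersk = [v for v in cleaned if "MAERSK" in v["name"].upper()]

print(cleaned[0])      # {'name': 'ATHINEA', 'imo': 9291248}
print(current_maersk)  # [{'name': 'HENRIETTE MAERSK', 'imo': 9399349}]
```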

Conclusion

In this tutorial, we covered the essentials of working with reference data using the Vortexa Python SDK. We began by discussing the importance of reference data and its role in supporting accurate and consistent data operations. Through examples covering geographies, products, and vessels, we demonstrated how to load and search each reference dataset for use in your analyses.

You’ve learned how to query and retrieve reference data based on different criteria such as name or type. Mastering the use of reference data enables you to drive more accurate insights, improve data consistency, and enhance your understanding of the energy markets.