I've tried a few MLOps-related modules so far.
This time, I will try installing Kedro.
abstract
How to install Kedro and run a minimal pipeline
1. requirement
Jetson Xavier NX
ubuntu18.04
Python 3.6.9
docker
2. install
You can install Kedro with the following command.
pip install kedro
The installation can be verified with the following command.
kedro info

 _            _
| | _____  __| |_ __ ___
| |/ / _ \/ _` | '__/ _ \
|   <  __/ (_| | | | (_) |
|_|\_\___|\__,_|_|  \___/
v0.18.14
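The same check can also be done from Python, which is handy inside a docker container or a script. This is a minimal sketch: the helper name kedro_version is my own and is not part of Kedro. It only relies on the kedro package exposing __version__.

```python
import importlib.util


def kedro_version():
    """Return the installed Kedro version string, or None if Kedro is not installed."""
    # find_spec returns None when the package cannot be found
    if importlib.util.find_spec("kedro") is None:
        return None
    import kedro

    return kedro.__version__


print(kedro_version())
```

If this prints None, pip installed Kedro into a different Python environment than the one you are running.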
3. coding
3.1 Creation of node
Write a function in Python, then register the function, its inputs, and its outputs in a node.
The sample code has the following structure:
func:return_greeting
|- input:none
|- output:my_salutation
func:join_statements
|- input:my_salutation (passed to the greeting argument)
|- output:my_message
from kedro.pipeline import node


# Prepare first node
def return_greeting():
    return "Hello"


# Prepare second node
def join_statements(greeting):
    return f"{greeting} Kedro!"


return_greeting_node = node(func=return_greeting, inputs=None, outputs="my_salutation")

join_statements_node = node(
    join_statements, inputs="my_salutation", outputs="my_message"
)
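Since a node only wraps an ordinary Python function, the two functions above can be checked on their own before Kedro is involved. A small sketch that repeats the function definitions so it runs standalone:

```python
def return_greeting():
    return "Hello"


def join_statements(greeting):
    return f"{greeting} Kedro!"


# Chain the functions by hand, the same way the pipeline will chain the nodes
message = join_statements(return_greeting())
print(message)  # Hello Kedro!
```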
3.2 Creating a pipeline
A pipeline describes the dependencies between nodes and their execution order.
from kedro.pipeline import pipeline

# Assemble nodes into a pipeline
greeting_pipeline = pipeline([return_greeting_node, join_statements_node])
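The execution order comes from matching each node's inputs against other nodes' outputs, not from the order of the list. The sketch below shows that idea in plain Python. It is a toy illustration and not Kedro's implementation; order_nodes and toy_nodes are hypothetical names I made up.

```python
# Toy illustration only: NOT Kedro's implementation.
def order_nodes(nodes):
    """Order (name, inputs, outputs) tuples so each node runs after its producers."""
    # Every dataset name that some node produces
    produced = {out for _, _, outs in nodes for out in outs}
    available = set()  # dataset names produced so far
    remaining = list(nodes)
    ordered = []
    while remaining:
        # A node is ready when each of its inputs is already available,
        # or is external data that no node produces
        ready = [
            n for n in remaining
            if all(i in available or i not in produced for i in n[1])
        ]
        if not ready:
            raise ValueError("circular dependency between nodes")
        for n in ready:
            ordered.append(n[0])
            available.update(n[2])
            remaining.remove(n)
    return ordered


# The nodes are deliberately listed in the "wrong" order
toy_nodes = [
    ("join_statements", ["my_salutation"], ["my_message"]),
    ("return_greeting", [], ["my_salutation"]),
]
print(order_nodes(toy_nodes))  # ['return_greeting', 'join_statements']
```

join_statements is listed first here, but it still runs second because it needs my_salutation.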
3.3 DataCatalog
DataCatalog is the registry of the datasets that a pipeline loads and saves, and it supports various data formats.
This time, we simply register an in-memory dataset as a box for the my_salutation value.
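from kedro.io import DataCatalog, MemoryDataSet

# Prepare a data catalog
# Note: newer Kedro releases spell this class MemoryDataset,
# which is the name that appears in the run log.
data_catalog = DataCatalog({"my_salutation": MemoryDataSet()})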
3.4 runner
Finally, create a runner to execute the pipeline.
from kedro.runner import SequentialRunner

# Create a runner to run the pipeline
runner = SequentialRunner()

# Run the pipeline
print(runner.run(greeting_pipeline, data_catalog))
4. run
4.1 Completed code
"""Contents of hello_kedro.py"""
from kedro.io import DataCatalog, MemoryDataSet
from kedro.pipeline import node, pipeline
from kedro.runner import SequentialRunner

# Prepare a data catalog
data_catalog = DataCatalog({"my_salutation": MemoryDataSet()})


# Prepare first node
def return_greeting():
    return "Hello"


return_greeting_node = node(return_greeting, inputs=None, outputs="my_salutation")


# Prepare second node
def join_statements(greeting):
    return f"{greeting} Kedro!"


join_statements_node = node(
    join_statements, inputs="my_salutation", outputs="my_message"
)

# Assemble nodes into a pipeline
greeting_pipeline = pipeline([return_greeting_node, join_statements_node])

# Create a runner to run the pipeline
runner = SequentialRunner()

# Run the pipeline
print(runner.run(greeting_pipeline, data_catalog))
4.2 run command
python hello_kedro.py
The execution result is as follows:
INFO     Running node: return_greeting(None) -> [my_salutation]
INFO     Saving data to 'my_salutation' (MemoryDataset)...
INFO     Completed 1 out of 2 tasks
INFO     Loading data from 'my_salutation' (MemoryDataset)...
INFO     Running node: join_statements([my_salutation]) -> [my_message]
INFO     Saving data to 'my_message' (MemoryDataset)...
INFO     Completed 2 out of 2 tasks
INFO     Pipeline execution completed successfully.
INFO     Loading data from 'my_message' (MemoryDataset)...
{'my_message': 'Hello Kedro!'}
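Only my_message shows up in the printed dictionary. As far as I understand Kedro 0.18, the runner returns the outputs that are not registered in the DataCatalog, and my_salutation was registered while my_message was not. The sketch below imitates that behavior in plain Python. It is a toy stand-in and not Kedro's runner; run_sequentially and its parameters are hypothetical names.

```python
# Toy illustration only: a plain-Python stand-in, NOT Kedro's runner.
def return_greeting():
    return "Hello"


def join_statements(greeting):
    return f"{greeting} Kedro!"


def run_sequentially(steps, registered):
    """Run (func, input_name, output_name) steps in the given order.

    Returns only the outputs whose names are not in `registered`.
    """
    store = {}  # plays the role of the in-memory datasets
    for func, input_name, output_name in steps:
        # "Load" the input (if any), call the function, "save" the output
        args = [] if input_name is None else [store[input_name]]
        store[output_name] = func(*args)
    return {name: value for name, value in store.items() if name not in registered}


steps = [
    (return_greeting, None, "my_salutation"),
    (join_statements, "my_salutation", "my_message"),
]
result = run_sequentially(steps, registered={"my_salutation"})
print(result)  # {'my_message': 'Hello Kedro!'}
```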