Home » Distributed Systems: Spark to Process Crypto Transactions and Identify Trading Sessions
Pyspark

Distributed Systems: Spark to Process Crypto Transactions and Identify Trading Sessions

This article dives into how Spark can be leveraged to process crypto transaction data efficiently as a distributed systems library. Specifically, we’ll demonstrate how to use Spark to analyze Bitcoin transactions, read large datasets, and calculate rolling metrics like average transaction value or total volume using window functions. By combining Spark’s SQL-like DataFrame API with functional programming, you can unlock new possibilities for data analysis in a distributed environment.

Whether you’re tracking wallet activity, identifying market trends, or developing trading algorithms, Spark’s robust architecture and scalability make it an indispensable tool for working with cryptocurrency data. Read on to see how Spark can transform raw blockchain data into actionable insights, all while handling the complexity of distributed systems.

How would you identify trading sessions coming from crypto wallets and do it with the maximum efficiency using a distributed systems approach?

PySpark: a Distributed Systems Library

1.Generate the Input of the Program

This script generates a synthetic dataset of Bitcoin transactions for distributed system testing. It defines wallet addresses and uses a base timestamp to simulate transaction times. 

import pandas as pd
import random
import uuid
from datetime import datetime, timedelta


def generate_transactions(num_wallets=5, transactions_per_wallet=20,
                          session_gap_seconds=60):
    """
    Generate a DataFrame of transactions with multiple wallets and sessions.

    Args:
        num_wallets (int): Number of unique wallets.
        transactions_per_wallet (int): Number of transactions per wallet.
        session_gap_seconds (int): Minimum time gap between sessions.

    Returns:
        pd.DataFrame: A DataFrame containing synthetic transactions.
    """
    wallets = [f"wallet_{i}" for i in range(1, num_wallets + 1)]
    transactions = []

    for wallet in wallets:
        current_time = datetime.now()
        for _ in range(transactions_per_wallet):
            # Randomly decide whether to stay in the same session or start a
            # new one
            if random.random() < 0.7:  # 70% chance to stay in the same session
                current_time += timedelta(
                    seconds=random.randint(1, session_gap_seconds - 1))
            else:  # 30% chance to start a new session
                current_time += timedelta(
                    seconds=random.randint(session_gap_seconds + 1,
                                           session_gap_seconds * 5))

            transaction_id = str(
                uuid.uuid4())  # Generate a UUID4 for each transaction
            transaction_value = round(random.uniform(0.01, 100), 2)
            transactions.append({
                "transaction_id": transaction_id,
                "timestamp": current_time,
                "wallet_address": wallet,
                "transaction_value": transaction_value
            })

    # Shuffle the transactions
    random.shuffle(transactions)

    return pd.DataFrame(transactions)


# Generate the dataset
df = generate_transactions()

# Save the shuffled transactions as a CSV to match the PySpark input format
file_path = "transactions.csv"
df.to_csv(file_path, index=False)

# Display the first few rows of the shuffled dataset
print(df.head())

Randomized values are generated for timestamps and transaction amounts, ensuring realistic variability. Each transaction includes a unique ID, timestamp, value, and wallet address. To avoid clustering by wallet, the dataset is shuffled after creation. The output is formatted as a CSV, ready for ingestion into tools like PySpark. This approach enables testing SQL window functions by calculating user activity frequencies or detecting anomalies.

I’m saving the output in a file called transactions.csv.

It looks like:

transaction_id,timestamp,wallet_address,transaction_value
b6b5d128-9466-48c5-ac2c-3a3c4d8fd71e,2024-12-13 13:49:07.335628,wallet_4,39.61
cea7a9a4-c0f3-4723-a462-afa614b60278,2024-12-13 14:17:19.335769,wallet_5,2.38
52d1f0b8-ce2b-4822-8294-09d13b9d2895,2024-12-13 13:50:31.335349,wallet_2,25.74
31676804-b96e-4b04-bf95-86df50212584,2024-12-13 13:48:17.335489,wallet_3,18.59
2d78e14f-0858-480a-812f-f8df1d700e67,2024-12-13 13:52:53.335489,wallet_3,22.06
d33837b3-f03b-4942-b0d0-3f6818d32882,2024-12-13 13:57:39.335489,wallet_3,15.4
7b03ce5d-6bd0-4053-a45d-3953484db9f8,2024-12-13 13:51:44.335769,wallet_5,94.19
8a1a52e7-b64a-4821-a551-e0dbdfca9f25,2024-12-13 14:01:20.335144,wallet_1,94.66
7fcf72bf-ec8e-430a-ba14-3fc63de0669d,2024-12-13 13:50:22.335489,wallet_3,79.09
c69f53f8-4832-42d2-ba0a-fdd9f1083d96,2024-12-13 14:15:43.335769,wallet_5,59.25
011dfa57-0189-44ef-84b5-1114f7f5a579,2024-12-13 14:01:59.335144,wallet_1,4.17
c3a8f41a-7690-4edd-9aad-710d7f126895,2024-12-13 13:48:57.335349,wallet_2,6.55
8a322b04-9a46-4c28-b8f8-9291cea55b2a,2024-12-13 14:04:21.335628,wallet_4,33.39
d1efad98-5c4b-471e-955c-d5d2773bd9ad,2024-12-13 13:50:07.335349,wallet_2,18.53
51f86056-8e89-45c2-8cee-f859f0644e09,2024-12-13 13:49:15.335349,wallet_2,15.8
0302e89e-7a32-41b2-a187-6ef45a6348b3,2024-12-13 14:13:28.335628,wallet_4,47.93
51c83df2-4fc2-47ac-a54a-e17e6ca52624,2024-12-13 13:53:59.335349,wallet_2,46.96
fa427c21-ec55-4968-9091-920d76e7cf03,2024-12-13 14:15:01.335769,wallet_5,26.3
56da405f-e7b9-4a1b-9fee-0c782f3ab3e8,2024-12-13 14:18:44.335769,wallet_5,48.57
f240d9ff-b437-4507-83a0-5504e531cbd6,2024-12-13 14:17:49.335628,wallet_4,71.24
29a65049-8d86-4616-93f5-f14e5ef1d52f,2024-12-13 13:50:36.335349,wallet_2,12.6
4e7d98a8-17ff-4955-9ded-a3147f3fc8eb,2024-12-13 13:49:48.335144,wallet_1,6.63
e31f6490-45e3-48b0-a25a-127ac0961bfe,2024-12-13 14:24:32.335628,wallet_4,96.45
cfedd64c-48df-4c29-beea-6188806f642f,2024-12-13 13:51:27.335489,wallet_3,21.55
0f5907e7-adbe-45ca-8769-77dad8441c52,2024-12-13 13:57:49.335144,wallet_1,84.33
1466ec97-daf7-40e5-aeee-7f78cf221342,2024-12-13 14:05:47.335144,wallet_1,94.49
523c4f43-6931-44ee-a431-5c251d6a3e5d,2024-12-13 14:04:25.335144,wallet_1,40.42
a889d772-a4b8-43a4-8af3-513f0aa66998,2024-12-13 14:10:28.335769,wallet_5,57.85
b0355521-ac48-4a81-8931-ecf753c24ad2,2024-12-13 14:04:20.335628,wallet_4,28.91
4d22cdf5-4d0a-4e75-bc91-35f15689b36f,2024-12-13 13:54:09.335144,wallet_1,55.44
4b0b88b2-3130-413a-9b87-72f2c039db86,2024-12-13 13:58:48.335144,wallet_1,37.63
1cccc256-f0b3-4f7a-9f3a-46cbd2005fe0,2024-12-13 13:55:25.335349,wallet_2,27.15
2467e6c4-1b5d-4a5f-8349-75ebdc051289,2024-12-13 14:23:57.335628,wallet_4,86.5
19896dbe-a466-4174-9e13-ef5fad7610ea,2024-12-13 14:00:01.335769,wallet_5,19.62
9745eedb-f9cd-4846-bd37-37879927d942,2024-12-13 14:05:32.335769,wallet_5,2.55
1154bc6d-d94c-44e8-9eaf-d178f1faf0bf,2024-12-13 14:02:42.335769,wallet_5,36.77
98cb1331-d8ee-40b9-912f-2b3c3a957896,2024-12-13 13:50:04.335144,wallet_1,23.64
36ed2ecd-ad15-4394-8e0f-b74819ed5bbf,2024-12-13 13:56:25.335628,wallet_4,20.56
b3640eeb-463e-418a-9f3b-7b8f99b23db2,2024-12-13 13:49:00.335144,wallet_1,18.42
93f617e1-dc15-4573-9033-6512f5001c80,2024-12-13 14:12:38.335769,wallet_5,44.17
ea71b187-6537-4a63-a806-085069daa1bd,2024-12-13 13:49:01.335489,wallet_3,46.95
5fad9453-9e38-4c92-80c2-95f1043d1cae,2024-12-13 13:58:03.335489,wallet_3,97.48
e954591b-456e-4ea0-b1c3-5464cef8517b,2024-12-13 14:00:54.335628,wallet_4,86.13
c9cbe0b3-e46a-4190-b49c-2fd21dc96116,2024-12-13 14:15:28.335769,wallet_5,88.02
e6819671-73b3-4a80-be20-ac87d40b6e9a,2024-12-13 14:00:01.335489,wallet_3,84.29
c57fd534-2bd6-4b73-b569-068fc56fd0cb,2024-12-13 13:49:38.335144,wallet_1,36.24
ad7f43db-179f-4c55-91f9-5825d1a4d7f9,2024-12-13 13:51:27.335349,wallet_2,88.39
3e9e09ea-ff15-4cfd-b2b3-13547dc7a251,2024-12-13 13:51:28.335489,wallet_3,59.25
e1ae36c2-2dc7-46dd-8d68-37a6e2f8c70e,2024-12-13 13:49:41.335144,wallet_1,85.65
f6d22deb-4732-49d8-a0d1-7c27a0b58484,2024-12-13 13:50:56.335489,wallet_3,96.0
3aabe437-93c9-43e4-b8b2-0894fbfb664d,2024-12-13 14:06:42.335628,wallet_4,2.46
c10480dd-5c26-4821-ae21-d93a66d6f7cd,2024-12-13 14:03:59.335628,wallet_4,33.57
149747bf-f817-483e-8abc-f8b786a35bc3,2024-12-13 13:49:14.335769,wallet_5,62.35
6c90922e-7592-4911-a982-9310a10b777c,2024-12-13 14:12:57.335628,wallet_4,0.66
d0d2fe87-720d-43af-9850-b8c5020bae4d,2024-12-13 14:20:07.335628,wallet_4,98.79
d6c93c72-41a1-41a3-b702-d016588d1fd9,2024-12-13 14:12:14.335769,wallet_5,46.18
4b3c7d43-c650-44e6-9a54-0c14868863c8,2024-12-13 13:58:27.335489,wallet_3,28.32
b48e2d4f-1f59-4ba6-acb4-692e4e80e9e0,2024-12-13 13:58:43.335489,wallet_3,82.07
329e031a-d6a8-423e-8c16-7924653c6726,2024-12-13 14:16:48.335769,wallet_5,46.2
c1591797-0c3d-4b76-8ba4-aca7d157b79c,2024-12-13 13:51:13.335144,wallet_1,26.53
0e48bab1-b504-4523-8aed-e8c14de89912,2024-12-13 14:01:28.335144,wallet_1,57.86
e2666162-818e-4e25-b5f3-e864e4ebe379,2024-12-13 13:52:15.335489,wallet_3,4.56
4caeaf6a-42c0-4927-bead-5107ddb1f41b,2024-12-13 14:01:40.335349,wallet_2,89.33
eddacc8f-2bb0-4ccb-8932-23fc8c9be373,2024-12-13 14:10:03.335628,wallet_4,61.55
05029b65-fd32-41d0-8d67-09a6ed0137a6,2024-12-13 13:57:28.335349,wallet_2,85.25
b5c0afa3-aa57-4919-87a8-c0d1c6b17cd0,2024-12-13 13:55:54.335769,wallet_5,71.11
0200b9e8-b8fc-4a53-b882-faa6c7411ade,2024-12-13 13:48:51.335349,wallet_2,16.46
136ec951-8aa8-4b5d-9b7b-70eef2703e6e,2024-12-13 13:51:05.335489,wallet_3,79.37
ac3c21cd-59b4-4601-9339-6f43d9c3e2f9,2024-12-13 14:11:21.335769,wallet_5,7.33
4489a59a-3631-4a95-8bd4-9ec27a1b1151,2024-12-13 13:55:03.335144,wallet_1,23.91
f4e022de-3538-4d26-9564-c0fab4627665,2024-12-13 14:00:15.335489,wallet_3,42.37
3c812b0c-f744-4789-b0d9-b96f6a204947,2024-12-13 14:00:03.335489,wallet_3,79.03
72d32b88-d666-4f12-90d2-0f3c4235d5a8,2024-12-13 13:59:47.335349,wallet_2,7.99
ea0d22fb-04d1-4e06-9d5e-acde120359f5,2024-12-13 14:03:36.335144,wallet_1,71.94
f54e7b21-e08c-4c8d-b75a-1915f9e97c22,2024-12-13 13:51:54.335349,wallet_2,39.98
f81cb6d3-fbae-4ba2-8ae4-c2897cf5adce,2024-12-13 14:14:13.335628,wallet_4,73.98
d87fb231-c6a7-4d51-831b-8d7f32550b0d,2024-12-13 13:59:23.335489,wallet_3,34.93
1dfa57b8-2c36-45c5-aad6-60729270885d,2024-12-13 13:50:59.335628,wallet_4,21.11
17217d47-5e96-4ccd-8655-1e4b60235973,2024-12-13 13:49:49.335769,wallet_5,50.24
4cad3ec4-c6f5-4143-a4a4-8b4a56895977,2024-12-13 13:58:52.335349,wallet_2,43.76
7e5307c7-904e-4341-9788-b03788d5c883,2024-12-13 14:06:19.335349,wallet_2,80.59
482055d8-3f90-406e-957d-df05a07a0015,2024-12-13 14:09:47.335769,wallet_5,89.16
1262b965-c687-4b94-bc2b-22b8863330fa,2024-12-13 14:04:48.335144,wallet_1,21.13
b92a8f82-4776-43fb-ba1f-dd85a5c43210,2024-12-13 13:52:25.335349,wallet_2,26.7
7127b03a-8f72-4569-bd60-6d48573d2c82,2024-12-13 13:50:37.335144,wallet_1,97.7
d9f82045-709d-4343-9041-99e8ed2b8f2d,2024-12-13 14:00:46.335349,wallet_2,86.29
7d5c0799-e69e-4595-8f74-6c0ea51ae827,2024-12-13 13:58:20.335349,wallet_2,89.93
5503ca7f-378d-4aaf-9a33-1b485479eb10,2024-12-13 13:49:33.335489,wallet_3,40.65
afb9710a-f624-4997-925f-d03596f88113,2024-12-13 13:54:21.335144,wallet_1,98.69
338d5e9a-1322-462a-9e4b-14d276f0f873,2024-12-13 14:07:00.335349,wallet_2,41.88
0d3eab82-5ba6-4866-8c30-ff2e7e61ebe1,2024-12-13 13:53:25.335489,wallet_3,59.7
c8f1d4ed-b4c6-45de-b4da-bfacb108360a,2024-12-13 14:18:01.335769,wallet_5,1.82
112681ff-4b90-47d0-bcc2-37f87350a178,2024-12-13 14:13:14.335769,wallet_5,59.18
adc8bae7-f4de-4072-ab74-7ff613fc0e76,2024-12-13 14:00:35.335628,wallet_4,94.26
7c6f3555-b1a7-4c8b-968e-107ce366700d,2024-12-13 13:52:01.335144,wallet_1,17.85
c613a435-612a-4e3d-9683-796cbaad2c7d,2024-12-13 13:51:53.335628,wallet_4,74.31
9f49d990-e6ad-44ab-8e4a-49d7b3fd7f8d,2024-12-13 13:55:40.335628,wallet_4,88.9
c78be9ec-8ef5-4f43-b2dc-cc728ec9c9d2,2024-12-13 13:51:14.335489,wallet_3,2.04
78680e2b-eb74-4b9b-b493-30f00c36165a,2024-12-13 13:49:50.335349,wallet_2,9.33
ffa24c02-a380-4ca6-b81a-797eefc7d895,2024-12-13 14:18:33.335628,wallet_4,51.89

2. Use PySpark to read the file

Here we read the file using spark.read:

from pyspark.sql import SparkSession

# Initialize Spark session
spark = SparkSession.builder \
    .appName("Bitcoin Transactions Processing") \
    .getOrCreate()

# File path to the CSV file
file_path = "transactions.csv"

transactions = spark.read.csv(file_path, header=True, inferSchema=True)

print(transactions.show())

transactions here is a dataframe and we get:

(env) ➜ python3 transactions.py
24/12/13 14:16:10 WARN Utils: Your hostname, Clements-MacBook-Pro.local resolves to a loopback address: 127.94.0.3; using 192.168.0.15 instead (on interface en0)
24/12/13 14:16:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/12/13 14:16:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
+--------------------+--------------------+--------------+-----------------+
|      transaction_id|           timestamp|wallet_address|transaction_value|
+--------------------+--------------------+--------------+-----------------+
|b6b5d128-9466-48c...|2024-12-13 13:49:...|      wallet_4|            39.61|
|cea7a9a4-c0f3-472...|2024-12-13 14:17:...|      wallet_5|             2.38|
|52d1f0b8-ce2b-482...|2024-12-13 13:50:...|      wallet_2|            25.74|
|31676804-b96e-4b0...|2024-12-13 13:48:...|      wallet_3|            18.59|
|2d78e14f-0858-480...|2024-12-13 13:52:...|      wallet_3|            22.06|
|d33837b3-f03b-494...|2024-12-13 13:57:...|      wallet_3|             15.4|
|7b03ce5d-6bd0-405...|2024-12-13 13:51:...|      wallet_5|            94.19|
|8a1a52e7-b64a-482...|2024-12-13 14:01:...|      wallet_1|            94.66|
|7fcf72bf-ec8e-430...|2024-12-13 13:50:...|      wallet_3|            79.09|
|c69f53f8-4832-42d...|2024-12-13 14:15:...|      wallet_5|            59.25|
|011dfa57-0189-44e...|2024-12-13 14:01:...|      wallet_1|             4.17|
|c3a8f41a-7690-4ed...|2024-12-13 13:48:...|      wallet_2|             6.55|
|8a322b04-9a46-4c2...|2024-12-13 14:04:...|      wallet_4|            33.39|
|d1efad98-5c4b-471...|2024-12-13 13:50:...|      wallet_2|            18.53|
|51f86056-8e89-45c...|2024-12-13 13:49:...|      wallet_2|             15.8|
|0302e89e-7a32-41b...|2024-12-13 14:13:...|      wallet_4|            47.93|
|51c83df2-4fc2-47a...|2024-12-13 13:53:...|      wallet_2|            46.96|
|fa427c21-ec55-496...|2024-12-13 14:15:...|      wallet_5|             26.3|
|56da405f-e7b9-4a1...|2024-12-13 14:18:...|      wallet_5|            48.57|
|f240d9ff-b437-450...|2024-12-13 14:17:...|      wallet_4|            71.24|
+--------------------+--------------------+--------------+-----------------+
only showing top 20 rows


2. Group the transactions by trading sessions

To group transactions by a “trading session,” we define the session as a sequence of transactions from the same wallet where the time gap between consecutive transactions does not exceed a specified threshold (e.g., 60 seconds). Transactions are grouped independently for each wallet to ensure that sessions do not overlap across wallets. If the time gap between two transactions exceeds the threshold, a new session begins. Additionally, global session IDs can be generated to uniquely identify sessions across all wallets, enabling consistent tracking and analysis. This approach is flexible, allowing the time gap threshold to be adjusted based on specific trading or business requirements.

# Convert timestamp to a datetime type
df = transactions.withColumn("timestamp", col("timestamp").cast("timestamp"))


# Define a window partitioned by wallet and ordered by timestamp
window_spec = Window.partitionBy("wallet_address").orderBy("timestamp")

# Calculate the time difference between transactions in seconds
df = df.withColumn(
    "time_diff",
    (col("timestamp").cast("long") - lag("timestamp").over(window_spec).cast("long"))
)

# Define a session indicator based on the 1-minute threshold (60 seconds)
df = df.withColumn(
    "new_session",
    when(col("time_diff") > 60, 1).otherwise(0)
)


# Generate a session ID by cumulatively summing the session indicator within the wallet-specific window
df = df.withColumn(
    "wallet_session_id",
    sum("new_session").over(window_spec)
)

# Generate unique session IDs across all wallets
session_window = Window.orderBy("wallet_address", "wallet_session_id")
df = df.withColumn(
    "global_session_id",
    dense_rank().over(session_window)
)

# Show the results
df.select("transaction_id", "timestamp", "wallet_address", "global_session_id").show()

3. Display the results

Let’s give an overview of the results:

+--------------------+--------------------+--------------+-----------------+---------+-----------+-----------------+-----------------+
|      transaction_id|           timestamp|wallet_address|transaction_value|time_diff|new_session|wallet_session_id|global_session_id|
+--------------------+--------------------+--------------+-----------------+---------+-----------+-----------------+-----------------+
|b3640eeb-463e-418...|2024-12-13 13:49:...|      wallet_1|            18.42|     NULL|          0|                0|                1|
|c57fd534-2bd6-4b7...|2024-12-13 13:49:...|      wallet_1|            36.24|       38|          0|                0|                1|
|e1ae36c2-2dc7-46d...|2024-12-13 13:49:...|      wallet_1|            85.65|        3|          0|                0|                1|
|4e7d98a8-17ff-495...|2024-12-13 13:49:...|      wallet_1|             6.63|        7|          0|                0|                1|
|98cb1331-d8ee-40b...|2024-12-13 13:50:...|      wallet_1|            23.64|       16|          0|                0|                1|
|7127b03a-8f72-456...|2024-12-13 13:50:...|      wallet_1|             97.7|       33|          0|                0|                1|
|c1591797-0c3d-4b7...|2024-12-13 13:51:...|      wallet_1|            26.53|       36|          0|                0|                1|
|7c6f3555-b1a7-4c8...|2024-12-13 13:52:...|      wallet_1|            17.85|       48|          0|                0|                1|
|4d22cdf5-4d0a-4e7...|2024-12-13 13:54:...|      wallet_1|            55.44|      128|          1|                1|                2|
|afb9710a-f624-499...|2024-12-13 13:54:...|      wallet_1|            98.69|       12|          0|                1|                2|
|4489a59a-3631-4a9...|2024-12-13 13:55:...|      wallet_1|            23.91|       42|          0|                1|                2|
|0f5907e7-adbe-45c...|2024-12-13 13:57:...|      wallet_1|            84.33|      166|          1|                2|                3|
|4b0b88b2-3130-413...|2024-12-13 13:58:...|      wallet_1|            37.63|       59|          0|                2|                3|
|8a1a52e7-b64a-482...|2024-12-13 14:01:...|      wallet_1|            94.66|      152|          1|                3|                4|
|0e48bab1-b504-452...|2024-12-13 14:01:...|      wallet_1|            57.86|        8|          0|                3|                4|
|011dfa57-0189-44e...|2024-12-13 14:01:...|      wallet_1|             4.17|       31|          0|                3|                4|
|ea0d22fb-04d1-4e0...|2024-12-13 14:03:...|      wallet_1|            71.94|       97|          1|                4|                5|
|523c4f43-6931-44e...|2024-12-13 14:04:...|      wallet_1|            40.42|       49|          0|                4|                5|
|1262b965-c687-4b9...|2024-12-13 14:04:...|      wallet_1|            21.13|       23|          0|                4|                5|
|1466ec97-daf7-40e...|2024-12-13 14:05:...|      wallet_1|            94.49|       59|          0|                4|                5|
|0200b9e8-b8fc-4a5...|2024-12-13 13:48:...|      wallet_2|            16.46|     NULL|          0|                0|                6|
|c3a8f41a-7690-4ed...|2024-12-13 13:48:...|      wallet_2|             6.55|        6|          0|                0|                6|
|51f86056-8e89-45c...|2024-12-13 13:49:...|      wallet_2|             15.8|       18|          0|                0|                6|
|78680e2b-eb74-4b9...|2024-12-13 13:49:...|      wallet_2|             9.33|       35|          0|                0|                6|
|d1efad98-5c4b-471...|2024-12-13 13:50:...|      wallet_2|            18.53|       17|          0|                0|                6|
|52d1f0b8-ce2b-482...|2024-12-13 13:50:...|      wallet_2|            25.74|       24|          0|                0|                6|
|29a65049-8d86-461...|2024-12-13 13:50:...|      wallet_2|             12.6|        5|          0|                0|                6|
|ad7f43db-179f-4c5...|2024-12-13 13:51:...|      wallet_2|            88.39|       51|          0|                0|                6|
|f54e7b21-e08c-4c8...|2024-12-13 13:51:...|      wallet_2|            39.98|       27|          0|                0|                6|
|b92a8f82-4776-43f...|2024-12-13 13:52:...|      wallet_2|             26.7|       31|          0|                0|                6|
|51c83df2-4fc2-47a...|2024-12-13 13:53:...|      wallet_2|            46.96|       94|          1|                1|                7|
|1cccc256-f0b3-4f7...|2024-12-13 13:55:...|      wallet_2|            27.15|       86|          1|                2|                8|
|05029b65-fd32-41d...|2024-12-13 13:57:...|      wallet_2|            85.25|      123|          1|                3|                9|
|7d5c0799-e69e-459...|2024-12-13 13:58:...|      wallet_2|            89.93|       52|          0|                3|                9|
|4cad3ec4-c6f5-414...|2024-12-13 13:58:...|      wallet_2|            43.76|       32|          0|                3|                9|
|72d32b88-d666-4f1...|2024-12-13 13:59:...|      wallet_2|             7.99|       55|          0|                3|                9|
|d9f82045-709d-434...|2024-12-13 14:00:...|      wallet_2|            86.29|       59|          0|                3|                9|
|4caeaf6a-42c0-492...|2024-12-13 14:01:...|      wallet_2|            89.33|       54|          0|                3|                9|
|7e5307c7-904e-434...|2024-12-13 14:06:...|      wallet_2|            80.59|      279|          1|                4|               10|
|338d5e9a-1322-462...|2024-12-13 14:07:...|      wallet_2|            41.88|       41|          0|                4|               10|
|31676804-b96e-4b0...|2024-12-13 13:48:...|      wallet_3|            18.59|     NULL|          0|                0|               11|
|ea71b187-6537-4a6...|2024-12-13 13:49:...|      wallet_3|            46.95|       44|          0|                0|               11|
|5503ca7f-378d-4aa...|2024-12-13 13:49:...|      wallet_3|            40.65|       32|          0|                0|               11|
|7fcf72bf-ec8e-430...|2024-12-13 13:50:...|      wallet_3|            79.09|       49|          0|                0|               11|
|f6d22deb-4732-49d...|2024-12-13 13:50:...|      wallet_3|             96.0|       34|          0|                0|               11|
|136ec951-8aa8-4b5...|2024-12-13 13:51:...|      wallet_3|            79.37|        9|          0|                0|               11|
|c78be9ec-8ef5-4f4...|2024-12-13 13:51:...|      wallet_3|             2.04|        9|          0|                0|               11|
|cfedd64c-48df-4c2...|2024-12-13 13:51:...|      wallet_3|            21.55|       13|          0|                0|               11|
|3e9e09ea-ff15-4cf...|2024-12-13 13:51:...|      wallet_3|            59.25|        1|          0|                0|               11|
|e2666162-818e-4e2...|2024-12-13 13:52:...|      wallet_3|             4.56|       47|          0|                0|               11|
|2d78e14f-0858-480...|2024-12-13 13:52:...|      wallet_3|            22.06|       38|          0|                0|               11|
|0d3eab82-5ba6-486...|2024-12-13 13:53:...|      wallet_3|             59.7|       32|          0|                0|               11|
|d33837b3-f03b-494...|2024-12-13 13:57:...|      wallet_3|             15.4|      254|          1|                1|               12|
|5fad9453-9e38-4c9...|2024-12-13 13:58:...|      wallet_3|            97.48|       24|          0|                1|               12|
|4b3c7d43-c650-44e...|2024-12-13 13:58:...|      wallet_3|            28.32|       24|          0|                1|               12|
|b48e2d4f-1f59-4ba...|2024-12-13 13:58:...|      wallet_3|            82.07|       16|          0|                1|               12|
|d87fb231-c6a7-4d5...|2024-12-13 13:59:...|      wallet_3|            34.93|       40|          0|                1|               12|
|e6819671-73b3-4a8...|2024-12-13 14:00:...|      wallet_3|            84.29|       38|          0|                1|               12|
|3c812b0c-f744-478...|2024-12-13 14:00:...|      wallet_3|            79.03|        2|          0|                1|               12|
|f4e022de-3538-4d2...|2024-12-13 14:00:...|      wallet_3|            42.37|       12|          0|                1|               12|
|b6b5d128-9466-48c...|2024-12-13 13:49:...|      wallet_4|            39.61|     NULL|          0|                0|               13|
|1dfa57b8-2c36-45c...|2024-12-13 13:50:...|      wallet_4|            21.11|      112|          1|                1|               14|
|c613a435-612a-4e3...|2024-12-13 13:51:...|      wallet_4|            74.31|       54|          0|                1|               14|
|9f49d990-e6ad-44a...|2024-12-13 13:55:...|      wallet_4|             88.9|      227|          1|                2|               15|
|36ed2ecd-ad15-439...|2024-12-13 13:56:...|      wallet_4|            20.56|       45|          0|                2|               15|
|adc8bae7-f4de-407...|2024-12-13 14:00:...|      wallet_4|            94.26|      250|          1|                3|               16|
|e954591b-456e-4ea...|2024-12-13 14:00:...|      wallet_4|            86.13|       19|          0|                3|               16|
|c10480dd-5c26-482...|2024-12-13 14:03:...|      wallet_4|            33.57|      185|          1|                4|               17|
|b0355521-ac48-4a8...|2024-12-13 14:04:...|      wallet_4|            28.91|       21|          0|                4|               17|
|8a322b04-9a46-4c2...|2024-12-13 14:04:...|      wallet_4|            33.39|        1|          0|                4|               17|
|3aabe437-93c9-43e...|2024-12-13 14:06:...|      wallet_4|             2.46|      141|          1|                5|               18|
|eddacc8f-2bb0-4cc...|2024-12-13 14:10:...|      wallet_4|            61.55|      201|          1|                6|               19|
|6c90922e-7592-491...|2024-12-13 14:12:...|      wallet_4|             0.66|      174|          1|                7|               20|
|0302e89e-7a32-41b...|2024-12-13 14:13:...|      wallet_4|            47.93|       31|          0|                7|               20|
|f81cb6d3-fbae-4ba...|2024-12-13 14:14:...|      wallet_4|            73.98|       45|          0|                7|               20|
|f240d9ff-b437-450...|2024-12-13 14:17:...|      wallet_4|            71.24|      216|          1|                8|               21|
|ffa24c02-a380-4ca...|2024-12-13 14:18:...|      wallet_4|            51.89|       44|          0|                8|               21|
|d0d2fe87-720d-43a...|2024-12-13 14:20:...|      wallet_4|            98.79|       94|          1|                9|               22|
|2467e6c4-1b5d-4a5...|2024-12-13 14:23:...|      wallet_4|             86.5|      230|          1|               10|               23|
|e31f6490-45e3-48b...|2024-12-13 14:24:...|      wallet_4|            96.45|       35|          0|               10|               23|
|149747bf-f817-483...|2024-12-13 13:49:...|      wallet_5|            62.35|     NULL|          0|                0|               24|
|17217d47-5e96-4cc...|2024-12-13 13:49:...|      wallet_5|            50.24|       35|          0|                0|               24|
|7b03ce5d-6bd0-405...|2024-12-13 13:51:...|      wallet_5|            94.19|      115|          1|                1|               25|
|b5c0afa3-aa57-491...|2024-12-13 13:55:...|      wallet_5|            71.11|      250|          1|                2|               26|
|19896dbe-a466-417...|2024-12-13 14:00:...|      wallet_5|            19.62|      247|          1|                3|               27|
|1154bc6d-d94c-44e...|2024-12-13 14:02:...|      wallet_5|            36.77|      161|          1|                4|               28|
|9745eedb-f9cd-484...|2024-12-13 14:05:...|      wallet_5|             2.55|      170|          1|                5|               29|
|482055d8-3f90-406...|2024-12-13 14:09:...|      wallet_5|            89.16|      255|          1|                6|               30|
|a889d772-a4b8-43a...|2024-12-13 14:10:...|      wallet_5|            57.85|       41|          0|                6|               30|
|ac3c21cd-59b4-460...|2024-12-13 14:11:...|      wallet_5|             7.33|       53|          0|                6|               30|
|d6c93c72-41a1-41a...|2024-12-13 14:12:...|      wallet_5|            46.18|       53|          0|                6|               30|
|93f617e1-dc15-457...|2024-12-13 14:12:...|      wallet_5|            44.17|       24|          0|                6|               30|
|112681ff-4b90-47d...|2024-12-13 14:13:...|      wallet_5|            59.18|       36|          0|                6|               30|
|fa427c21-ec55-496...|2024-12-13 14:15:...|      wallet_5|             26.3|      107|          1|                7|               31|
|c9cbe0b3-e46a-419...|2024-12-13 14:15:...|      wallet_5|            88.02|       27|          0|                7|               31|
|c69f53f8-4832-42d...|2024-12-13 14:15:...|      wallet_5|            59.25|       15|          0|                7|               31|
|329e031a-d6a8-423...|2024-12-13 14:16:...|      wallet_5|             46.2|       65|          1|                8|               32|
|cea7a9a4-c0f3-472...|2024-12-13 14:17:...|      wallet_5|             2.38|       31|          0|                8|               32|
|c8f1d4ed-b4c6-45d...|2024-12-13 14:18:...|      wallet_5|             1.82|       42|          0|                8|               32|
|56da405f-e7b9-4a1...|2024-12-13 14:18:...|      wallet_5|            48.57|       43|          0|                8|               32|
+--------------------+--------------------+--------------+-----------------+---------+-----------+-----------------+-----------------+

4. Conclusion

The sessionization logic works effectively, grouping transactions into sessions based on defined time gaps between consecutive transactions. Transactions within a session exhibit consistent behavior, with no session breaks unless the time gap exceeds the specified threshold. New sessions are correctly identified when larger gaps occur, and each session is assigned a unique session ID. Wallets are processed independently, ensuring that session calculations are not influenced by transactions from other wallets. The global session IDs provide a unified numbering system across all wallets, enabling seamless identification and tracking of sessions. The logic handles edge cases, such as missing previous transactions for the first entry in a wallet, without disrupting the analysis. This approach is robust and scalable, making it suitable for analyzing both frequent and sparse transaction patterns across diverse datasets and use cases.

See also  Nvidia GeForce GTX 1080 Mobile: Review

Related Posts