Design Document Capstone 2025 CS.026 Personal Data Acquisition
1. Overview
1.1 Introduction
In the field of data acquisition, there exists a significant gap between high-cost professional solutions and low-budget DIY alternatives. Our project aims to bridge this gap by developing an easy-to-use, low-cost, modular system designed to accommodate a diverse range of needs. This document provides an in-depth overview of the architectural decisions we’ve made to help guide our system development. By prioritizing affordability, flexibility, and modularity, we will offer a solution that lowers the barriers into the data acquisition field.
1.2 Goals and Principles
As reflected in the requirements document, the key architectural goals are accessibility, flexibility, modularity, ease of use, security, and performance. We want our system to be low cost, easy to set up, and provide easy to understand data. It should be usable for someone without deep technical knowledge or with limited time to learn a new system. Along with the goal of a lightweight, easy to pick up system, we want it to perform well without the user needing to spend time testing and configuring the hardware and software.
1.3 Technologies, Frameworks, and Patterns
At a broad level we will use an n-tiered architecture pattern to emphasize separation of concerns while improving scalability, reusability, modularity, and encapsulation. The n-tier architecture splits the software into multiple distinct layers that provide a logical division of functions; a typical n-tiered system uses three layers: UI, Business Logic (BLL), and Data Access (DAL). In the n-tiered pattern, dependency flows down: the UI depends on the BLL, which in turn depends on the DAL. The resulting downside is that the BLL requires a valid DAL to compile and run, which increases development and testing complexity.
Using a Microservice architecture could address some of these issues by decomposing the application into small, independent services that can be developed, deployed, and scaled independently. However, microservices introduce significant overhead in terms of infrastructure management, requiring tools like Kubernetes for orchestration. For our project, which does not require the extreme scalability or distributed processing that microservices provide, the added complexity would outweigh the benefits. As a scalability solution, our BLL and DAL will be hosted through AWS which will allow both resources to easily be scaled and deployed in different geographic regions as the project needs change. As for the rest of the project, the UI is simply hosted on the user's device and the hardware is user specific without the need to scale.
To remove the layer interdependency without committing to microservices, we will adopt the clean (also known as hexagonal, ports-and-adapters, or onion) architecture pattern together with dependency injection, which decouples the BLL and DAL. Because the main systems involve networking and web interaction, combining clean architecture with asynchronous messaging allows the tiers to be decoupled entirely, which also increases development flexibility since each layer can be developed and tested independently. In the clean architecture pattern, decoupling between layers is achieved by deferring communication to abstracted functions that each respective layer implements.
Figure 1 below contains an overview of our project's clean architecture design, which is decoupled into five system categories: frontend, backend, hardware, database, and external APIs. The core of the diagram, the backend, has no dependencies on any other parts but communicates through an abstracted interface. The middle layer consists of the frontend, database, and hardware components, each of which communicates with the backend through the abstracted functions or external APIs but never directly within the layer. Finally, the external APIs rely on functionality created by the middle layer and are therefore dependent on it. The complete backend decoupling, along with the abstracted interfaces, allows each component and layer to be developed independently.
| Figure 1: Architecture Design Diagram |
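The decoupling described above can be sketched in Rust with a trait acting as the abstracted interface between the BLL and DAL. The names below (DataStore, SensorReading, MemoryStore) are hypothetical placeholders for illustration, not the project's actual API:

```rust
/// Hypothetical sensor reading passed between layers.
pub struct SensorReading {
    pub sensor_id: u32,
    pub value: f64,
}

/// Abstracted interface the BLL depends on; any DAL
/// (Amazon RDS, local SQLite, an in-memory mock) implements it.
pub trait DataStore {
    fn store(&mut self, reading: SensorReading);
    fn latest(&self, sensor_id: u32) -> Option<f64>;
}

/// In-memory implementation, useful for testing the BLL
/// without a real database.
pub struct MemoryStore {
    readings: Vec<SensorReading>,
}

impl MemoryStore {
    pub fn new() -> Self {
        MemoryStore { readings: Vec::new() }
    }
}

impl DataStore for MemoryStore {
    fn store(&mut self, reading: SensorReading) {
        self.readings.push(reading);
    }

    fn latest(&self, sensor_id: u32) -> Option<f64> {
        // Most recent reading for the given sensor, if any.
        self.readings
            .iter()
            .rev()
            .find(|r| r.sensor_id == sensor_id)
            .map(|r| r.value)
    }
}

/// BLL logic depends only on the trait (dependency injection),
/// so it compiles and runs without any concrete DAL present.
pub fn record_and_confirm(store: &mut dyn DataStore, reading: SensorReading) -> bool {
    let id = reading.sensor_id;
    store.store(reading);
    store.latest(id).is_some()
}
```

Swapping MemoryStore for an RDS-backed implementation would require no change to the business logic, which is exactly the independence the clean architecture aims for.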
1.3.1 UI Layer - Frontend
The primary UI layer will be a Rust-based web app running on the user's device, utilizing egui along with the plot extension for graphing functionality to render a custom interface in a pure Rust environment. Egui offers OpenGL integration for 3D and general shader access through either glow or wgpu, allowing low-level access to WebGL2 or WebGPU. The UI layer will communicate with the BLL through an abstracted TCP API which sends requests following a client-server pattern.
1.3.2 Business Logic Layer - Backend
The primary BL layer will be a Rust-based TCP server which implements the abstracted TCP API that the UI layer interfaces with. The BLL will request data, parse it according to its business rules, and then return the result to the UI layer. It acts as a ‘black box’ of application logic where the UI requests are decoded to make the relevant DAL requests through a separate, abstracted DAL API. The TCP API will follow a RESTful API format and is expanded upon in the Data Architecture section.
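As a sketch of the ‘black box’ decoding step, the function below maps a RESTful-style request path onto an internal command the BLL can act on. The routes and the Request enum are illustrative assumptions; the real endpoints are defined in the API specification document:

```rust
/// Illustrative internal commands decoded from UI requests.
#[derive(Debug, PartialEq)]
enum Request {
    ListSessions,
    GetSession(u32),
    Unknown,
}

/// Decode a hypothetical RESTful path such as "GET /sessions/42"
/// into a command for the business logic to execute.
fn decode(method: &str, path: &str) -> Request {
    let parts: Vec<&str> = path.trim_matches('/').split('/').collect();
    match (method, parts.as_slice()) {
        ("GET", ["sessions"]) => Request::ListSessions,
        ("GET", ["sessions", id]) => match id.parse() {
            Ok(n) => Request::GetSession(n),
            // Malformed IDs fall through to an error response.
            Err(_) => Request::Unknown,
        },
        _ => Request::Unknown,
    }
}
```

Keeping the decoding step pure (no networking, no database access) makes it easy to unit test independently of the TCP server that wraps it.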
1.3.3 Data Access Layer - Database
The DA layer will implement the abstracted DAL API to facilitate communication between a respective database and the BLL. The primary DAL will implement communication for an AWS relational database utilizing Amazon RDS. The DAL API will follow a RESTful API format and is expanded upon in the Data Architecture section.
1.3.4 Hardware
Hardware implementation for this project will consist of six sensors created by the ECE Capstone team, as well as a data transfer module built by the CS Capstone team: a single-board computer that extracts data from the sensors and transfers it wirelessly via Wi-Fi or cellular technologies. The hardware will also implement communication with a local SQLite3 database utilizing the rusqlite crate to allow data to be stored when a network connection is unavailable.
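The store-and-forward behavior can be sketched without the database layer: readings accumulate in a local buffer while the network is down and are flushed once a connection returns. In the real device the buffer would be the rusqlite-backed SQLite database rather than the in-memory queue assumed here, and "sent" would be a network transmission:

```rust
use std::collections::VecDeque;

/// Buffers readings locally while offline; stands in for the
/// on-device SQLite database described above.
struct Uplink {
    buffer: VecDeque<String>,
    online: bool,
    sent: Vec<String>, // stands in for the remote server
}

impl Uplink {
    fn new() -> Self {
        Uplink { buffer: VecDeque::new(), online: false, sent: Vec::new() }
    }

    /// Queue a reading; transmit immediately if the network is up.
    fn record(&mut self, reading: &str) {
        self.buffer.push_back(reading.to_string());
        if self.online {
            self.flush();
        }
    }

    /// Drain the local buffer to the remote side in arrival order.
    fn flush(&mut self) {
        while let Some(r) = self.buffer.pop_front() {
            self.sent.push(r);
        }
    }

    /// Called when Wi-Fi or cellular connectivity is restored.
    fn set_online(&mut self) {
        self.online = true;
        self.flush();
    }
}
```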
The ECE team's early decision making left us with some unknowns, so we initially designed our own hardware portion of the project, which ultimately simplified communication between our projects. We collaborated to establish a protocol that will make integrating the two teams' portions straightforward. Recently, their communication process has allowed us to hone in on some of the finer details of the integration process.
2. Core Components
2.1 User Interface
The user interface will start on a Login page with text inputs for a username and password. Once logged in, the Home page will be displayed with sections that can be clicked on to view in more detail. Sections will include: a session history window, a device connection window, and an account information window. Here, the user will be able to select a session dataset to view graphs and other information, or export the data to a document for download. Historical session datasets will be visibly sorted by a variety of options such as date and duration. Throughout the UI, the device connection section will display date, time, and other information.
Once an old session is selected or a new session is created, the Sensor Readout page will be displayed. The Readout page will include: configuration windows, a session control window, the device connection window, and the sensor data display window. The historical or live data will be shown here, as well as important status information such as latency, uptime, and which sensors are operating. When multiple sensors are operating and transmitting data at once, there will be multiple panels showing visualized data relevant to each. If those sensors are offline, the panels will be grayed out with a message telling the user of the status (errors, disabled in settings, etc). Data will be displayed through relevant graphs or maps, and the user will be able to select the format they prefer.
As the device and web app are operating, errors will be displayed as pop-ups with relevant information and a button to dismiss. Throughout the app, colors will be persistent and dictated by the selected options in settings. The UI will make use of clearly understandable icons to prevent text overcrowding, as well as simple animations for interaction and transitions between pages. The UI will also adapt to different aspect ratios and screen sizes.
2.1.1 Configuration
In order to keep things as accessible as possible, the user will be able to configure the device through the UI on the web app. Sections to include will be sensor calibration and testing, data filtering and visualization settings, and UI settings such as theme, colorblind modes, and light/dark mode.
2.1.2 Low-Fidelity Design
| Figure 2: Login Page Excalidraw Design |
| Figure 3: Home Page Excalidraw Design |
| Figure 4: Sensor Readout (Single) Page Excalidraw Design |
| Figure 5: Sensor Readout (Multi) Page Excalidraw Design |
2.1.3 High-Fidelity Design
| Figure 6: Login Page Figma Design |
| Figure 7: Home Page Figma Design |
| Figure 8: Sensor Readout (Table) Page Figma Design |
| Figure 9: Sensor Readout (Multi) Page Figma Design |
| Figure 10: Sensor Readout (Map) Page Figma Design |
2.2 Authentication and Security
Communication and data transfer, real-time or otherwise, will be secured using a system modeled after Transport Layer Security (TLS) using Diffie-Hellman (DH) for traffic encryption key (TEK) exchange, digital signatures for authentication, HMAC for integrity, AES-128-CBC for encryption, along with periodic key checks to ensure security. AES-256-CBC could also be used for increased security at the cost of performance, but AES-128-CBC was chosen due to the performance increase and the security level of our use case. Security during user authentication could further be increased through the use of multifactor authentication (MFA) or one time passwords (OTP) to reduce the frequency of sensitive data transfer.
The user password will be processed using Argon2id to ensure data security. All user secrets the software doesn’t need to know will be transferred, stored, and processed as the SHA-256 hash of the user secret to protect against user secret leaks due to data breaches. Security could be increased as needed in the future using salting and additional hash functions. Any user secrets the software requires access to will be stored encrypted with AES-256-CBC using a randomly generated 256-bit data access key (DAK). The DAK will also be stored encrypted with AES-256-CBC with a 256-bit key encryption key (KEK) which will be stored in a secure external key management system such as AWS or Google Cloud key management service.
Simple password rules such as an eight character minimum, no passwords matching a username, and number/symbol requirements will be enforced to help users choose a secure password. To further help users ensure security, a bloom filter could be added at a later date to ensure users don’t pick common or compromised passwords.
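A minimal sketch of the rule check described above; the exact rules, error messages, and function name are illustrative assumptions:

```rust
/// Validate a candidate password against the rules outlined above:
/// eight-character minimum, not matching the username, and at least
/// one digit and one symbol. Returns Err with a reason on failure.
fn check_password(username: &str, password: &str) -> Result<(), &'static str> {
    if password.chars().count() < 8 {
        return Err("password must be at least eight characters");
    }
    if password.eq_ignore_ascii_case(username) {
        return Err("password must not match the username");
    }
    if !password.chars().any(|c| c.is_ascii_digit()) {
        return Err("password must contain a number");
    }
    if !password.chars().any(|c| !c.is_alphanumeric()) {
        return Err("password must contain a symbol");
    }
    Ok(())
}
```

Running the check on the client before submission gives immediate feedback, but it must also run server-side, since client checks can be bypassed.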
2.3 Hardware
The data logger is being designed using a Raspberry Pi as the central controller and data collector. Based on this design plan, we are using a Raspberry Pi with a 4G LTE module to increase data transfer speed and range. By using a Raspberry Pi 5, we will have the capability to transfer data via USB-to-USB, Bluetooth, or Wi-Fi. With the addition of a 4G LTE module, our system will be able to provide real-time data anywhere there is cellular service. We will implement a Bluetooth or USB-to-USB connection from the hardware sensors the ECE team is developing to the data logger. Additionally, we will be able to include a built-in display for battery status and cellular connection.
Other design considerations we have accounted for include using a battery power pack for the power supply and creating a water-resistant case that allows for USB connections to the Raspberry Pi and for the cellular antenna.
3. Data Architecture
3.1 Abstracted APIs
Moved to API specification document.
3.2 ER Diagram
| Figure 11: Entity Relationship Diagram |
3.3 Data Flow Diagram
| Figure 12: Data Flow Diagram |
4. Testing Strategies
Our methods for testing both the device and the software that handles the data, user interface, and other elements will consist of a mix of automated and manual testing. We have access to simulation data, which can be used in the automated tests to ensure that data is handled correctly from the database to the application.
Automated testing can validate our backend functionality such as the data logging on the device, data conversion & transmission to the server, and database storage & retrieval. Implementing unit tests will be the easiest route for the more independent cases like logging & data conversion. Integration tests can be used for the parts that involve multiple components of the system, like data transmission and retrieval. Additionally, we will implement a suite of tests with the goal of tracking the performance of the software and comparing it to an ideal range. The latency of data transmission needs to be tracked, and the amount of time it takes to process and display data should be tracked as well. Rust has a built-in testing command (cargo test) that can be used to run all of these tests.
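As an illustration of the unit-test route, the sketch below tests a hypothetical data-conversion function using Rust's built-in test framework, which cargo test discovers and runs automatically. The conversion itself (a 10-bit ADC count scaled to a 3.3 V reference) is an assumption; the real sensors' scaling will differ:

```rust
/// Hypothetical conversion from a raw 10-bit ADC count to volts,
/// assuming a 3.3 V reference; out-of-range counts are clamped.
fn adc_to_volts(raw: u16) -> f64 {
    f64::from(raw.min(1023)) * 3.3 / 1023.0
}

/// `cargo test` collects every function marked #[test] and
/// reports passes and failures.
#[test]
fn full_scale_reads_reference_voltage() {
    assert!((adc_to_volts(1023) - 3.3).abs() < 1e-9);
}

#[test]
fn zero_reads_zero_volts() {
    assert_eq!(adc_to_volts(0), 0.0);
}

#[test]
fn out_of_range_counts_are_clamped() {
    assert!((adc_to_volts(2000) - 3.3).abs() < 1e-9);
}
```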
Manual tests will be needed for user-interface (UI) and user-experience (UX) testing, which would involve gathering real test users who will use the application with simulation data. The goal of these tests is to ensure that the UI is responsive and provides useful information without confusing the user. After the user has completed the pre-defined tasks, there will be a survey for them to complete. This allows us to collect feedback and metrics on which parts of the application need improvement and if the overall design meets the user’s needs.
When the physical device enters the prototyping stage, manual tests with real world data will need to be performed involving the entire system. Ideally, these tests would be performed in the same situations where the data logger will be used, such as a Global Formula Racing kart at OSU. Data from each sensor should be collected and verified to make certain that the device will work in real world conditions.
5. Considerations
This section outlines items we have considered for optimizing system performance, planning for maintenance, and ensuring long-term support. To enhance performance, potential bottlenecks such as data transmission rates, memory usage, database queries, and graphics rendering will be analyzed. Tools will be developed to monitor hardware usage, latency, and other metrics, with an optional dev mode for easier troubleshooting. Stress testing and iterative optimization will ensure the system meets listed performance goals. Maintenance practices will be established during development and continued post-deployment while scalability options will be explored to enhance future capabilities.
5.1 Optimizing System Performance
To optimize performance, both for our device and application, we will first need to analyze bottlenecks. Expected bottlenecks include:
- Data transmission rates and latency
- Memory and local storage usage
- Database queries
- Data filtering/analysis algorithms
- Graphics rendering
In order to measure the impact these aspects might have, we will develop tools to monitor hardware usage, performance, latency, and other metrics that will be important to know. Some of these metrics will be visible to the user, but an optional dev mode will be used to make things easier to see. This will also provide a benefit to the end user so they can troubleshoot on their own. Once we know where our bottlenecks are, we can compare performance to our expected values, and optimize our code to improve it. Some reasonable performance goals include:
- 500ms or less response time for user interaction and navigation in the web app
- 2-3 seconds of latency for “real-time” data transmission, with frequent refreshing to prevent delays between receiving and displaying data
- Fast graphics loading, no more than a few seconds
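The latency tracking described above can be sketched with std::time::Instant; the helper name and the idea of a per-operation "budget" reflecting the 2-3 second real-time goal are assumptions for illustration:

```rust
use std::time::{Duration, Instant};

/// Time a single operation and report whether it met a latency goal.
/// Returns the measured duration and a pass/fail flag, so the same
/// helper serves both the dev-mode display and automated checks.
fn within_budget<F: FnOnce()>(op: F, budget: Duration) -> (Duration, bool) {
    let start = Instant::now();
    op();
    let elapsed = start.elapsed();
    (elapsed, elapsed <= budget)
}
```

In use, a data-fetch closure would be passed in with, for example, a Duration::from_secs(3) budget matching the real-time goal, and failures would be surfaced in the dev-mode metrics view.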
During development we will continually test to make sure that performance stays within a reasonable distance of our goals, to avoid becoming locked into an approach that performs poorly but is a core, difficult-to-change part of our system. We will conduct stress testing on our systems to determine aspects like data transmission sizes, refresh rates, and so on. Here are some choices we have made that will help improve the performance of our systems:
- Keep minimal data stored on the recording device to save storage and memory. It should be uploaded to a remote database and then cleared from memory
- Compare algorithms for data filtering and analysis to determine best performance
- Optimize database queries, as they can be a major slowdown
- Optimize graphics in the UI, as it can be a major use of system resources, and test on devices with different graphics hardware to ensure performance stays consistent
5.2 Planning for Maintenance and Long-Term Support
5.2.1 Maintenance
During development, we will get into the habit of performing several maintenance tasks that will continue after deployment. We will create periodic and well-organized backups of our codebase, previous versions, testing methods, and data. User data will be stored separately and securely, following the security procedures outlined in this document. During development, a latest stable build will be kept available, and a test build will be worked on. This will continue post-deployment. There should always be a functional and tested build available. A team of willing group members will monitor bugs and perform bug fixes after deployment during the lifespan of our project.
5.2.2 Scalability
After successful deployment of our project, we can expand upon it with newer features that will enhance its capabilities. A proper user account system allowing for cloud storage of data and monitoring on multiple devices is something we would consider pursuing. Another option would be to build a modular system for sensors, allowing the user to customize their device and upgrade it later.
5.2.3 User Support and Documentation
It’s too early in our project to determine exact specifics of our user support system. As we come closer to deployment these details will be ironed out. Currently we envision user support being provided through a simple ticket system with communication through email. User suggestions can also be taken this way. These systems will be handled through our project website. For documentation, we will write a comprehensive documentation article that will be hosted on our website. It will include an overview of features, a UI guide, tutorials and examples, and FAQs. It will make good use of images to aid users and will be divided into chapters for easy navigation.
6. Programming Style Guide
Group members should adhere to the following guidelines to ensure that the code is simple, clear, and consistent regardless of the author.
6.1 Naming Conventions
Names should be straightforward and descriptive. Generally, avoid abbreviations and single-letter names unless they are self-explanatory. In the case of ambiguous naming that is unavoidable, add a comment explaining the name. For variables and classes, use nouns or noun phrases, and for functions and methods, use verbs. Lastly, names should be easily pronounceable in English to make it easier to collaborate.
The following table summarizes the preferred typographical conventions for languages used in the project:
| | Constants | Variables | Types | Functions | Modules |
|---|---|---|---|---|---|
| Rust | UPPER_CASE | snake_case | UpperCamelCase | snake_case | snake_case |
| Python | UPPER_CASE | snake_case | UpperCamelCase | snake_case | lowercase |
| Table 1: Project Naming Conventions |
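The Rust row of Table 1 translates into names like the following; all identifiers here are illustrative, not actual project code:

```rust
/// Module names use snake_case.
mod sensor_io {
    /// Constants use UPPER_CASE.
    pub const MAX_SAMPLE_RATE_HZ: u32 = 1_000;

    /// Types use UpperCamelCase.
    pub struct SampleBuffer {
        /// Variables and fields use snake_case.
        pub sample_count: u32,
    }

    /// Functions use snake_case and verb phrases.
    pub fn read_sample_count(buffer: &SampleBuffer) -> u32 {
        buffer.sample_count
    }
}
```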
6.2 Spacing and Indentation
Include spaces on either side of infix operators (e.g., =, +, -, *, /). An exception to this rule is the use of colons (:) and double colons (::) for slicing in Python and as a path separator in Rust.
Always place a space after a comma and never before one. Spaces should precede parentheses, except in function calls. Open curly braces and colons that introduce new blocks of code should never be on their own line and should always be followed by a new line.
Separate items and statements with either one or two newlines. Use whitespace at logical breakpoints to increase readability of the code. Use four spaces for indentation. Never mix spaces and tabs.
The following example demonstrates preferred spacing and indentation for functions, function calls, and expressions in Rust:
| Figure 13: Rust Formatting Preference Example |
6.3 Comments
Use comments to add context or explain choices that are not already understandable through thoughtful naming or structure. Comments should be meaningful; do not use them for self-explanatory code (e.g., // Increment x by 1). When working on code, make use of detailed TODO comments to help direct yourself and your collaborators, keep track of issues, and record ideas. Lastly, remember that comments require maintenance. As such, always edit the corresponding comments when modifying code and avoid leaving redundant comments.
6.4 HTML
The first line should declare the document type. Always include the <title>, <html>, <meta>, <head>, and <body> tags. Specifically, make the <title> element as accurate as possible, and specify <html lang="en-us"> and <meta charset="utf-8"> to declare the language of the webpage and ensure correct encoding.
Always close non-empty HTML elements and add blank lines to increase readability, especially when creating nested elements such as tables or lists. The following image demonstrates how to create a table with proper spacing:
| Figure 14: HTML Table Formatting Preference Example |
6.5 rustfmt
Given that Rust is the primary programming language for the project, it is especially important that group members follow the same guidelines when coding in Rust. This is easily accomplished by using rustfmt, a tool that formats Rust according to the standardized style guidelines. These guidelines include the spacing and indentation rules outlined above.
The rustfmt tool is automatically downloaded when a user installs Rust and can be applied to the current working directory by running cargo fmt on the Stable toolchain or cargo +nightly fmt on the Nightly toolchain. Alternatively, it is recommended that all group members that use Visual Studio Code install the rust-analyzer extension. Editing the settings.json file for Visual Studio Code to include the following lines will ensure that the rustfmt tool is applied whenever a file is saved:
| Figure 15: Adding ‘rustfmt’ for Consistent Formatting |