Skip to content

App 16856: Implement UploadDataFromPath using builtin data manager logic#6113

Open
angelapredolac wants to merge 4 commits into
APP-16855from
APP-16856
Open

App 16856: Implement UploadDataFromPath using builtin data manager logic#6113
angelapredolac wants to merge 4 commits into
APP-16855from
APP-16856

Conversation

@angelapredolac

@angelapredolac angelapredolac commented Jun 12, 2026

Copy link
Copy Markdown
Member

Jira Ticket 16856: Implement UploadDataFromPath using builtin data manager logic

Summary
This implements the core logic behind the UploadDataFromPath API. The previous ticket (wiring PR — base of this branch) plumbed the RPC through RobotService → localRobot → the data manager as a stub that returned "data manager does not support UploadDataFromPath." This PR makes the builtin data manager actually perform the upload: given a file or directory path, it uploads each file to the cloud, deletes the source on success, and returns aggregate counts plus the cloud-assigned binary data ids.
Per discussion, UploadDataFromPath is only expected to be used for arbitrary (3rd-party) files, so every file is uploaded via the existing arbitrary-file FileUpload path (no special-casing of data-capture files). If a .capture file is ever passed, it's uploaded as a plain file rather than parsed into the capture pipeline.
This depends on the api proto PR and is stacked on the wiring branch (base: APP-16855).

Changes

services/datamanager/builtin/builtin.go: added the UploadDataFromPath method on the builtin service, which locks and delegates to the sync layer. This is the method the wiring ticket's interface assertion looks for, so the call now reaches the data manager instead of erroring.
services/datamanager/builtin/sync/sync.go: implemented Sync.UploadDataFromPath. It guards on the cloud connection, os.Stats the path, and either walks a directory or handles a single file, uploading each via the arbitrary-file path and aggregating files_uploaded / files_failed / bytes_uploaded / bytes_total / ids. For directories, a dropped connection stops the walk mid-way, so partial success is reflected in the counts. Refactored syncArbitraryFile to take a context and return the uploaded file's id (and to accept a per-call byte counter so ad-hoc uploads don't pollute the scheduled-sync stats).
services/datamanager/builtin/sync/upload_arbitrary_file.go: uploadArbitraryFile now returns the cloud-assigned BinaryDataId from the FileUploadResponse (previously discarded), which is what populates the ids field.
services/datamanager/builtin/sync/upload_data_from_path_test.go: new unit tests — cloud-not-ready guard, single file, directory aggregation, partial failure, and id population.
robot/impl/local_robot_test.go: updated the wiring ticket's guard test, which now expects "not connected to the cloud" (since the builtin implements the method).
Testing

Unit Tests:

sync package (TestUploadDataFromPath): asserts the not-connected guard, a single file (counts/bytes/id correct and source deleted), a directory aggregating counts/bytes/ids, and a partial failure where bytes_total counts everything discovered while only successful files contribute to files_uploaded/bytes_uploaded. The mock cloud client returns a known BinaryDataId so id population is verified.
Confirmed the existing builtin and sync tests still pass against the uploadArbitraryFile/syncArbitraryFile signature changes (go test ./services/datamanager/... ./robot/... -count=1).
Manual Testing:
Tested UploadDataFromPath end-to-end against a cloud-managed machine running a from-source build of this branch, using a local generic module whose DoCommand dials the machine's RobotClient and calls the new API. The happy path works for both a single file and a 50-file directory — all files upload, the returned ids match what lands in the DATA tab, and source files are deleted after a successful upload. I then exercised the failure modes: a non-existent path returns a clean error with nothing uploaded; an empty directory returns zero uploaded/zero failed with no error; an unreadable file (chmod 000) is counted as files_failed: 1 without a fatal error and isn't deleted; a directory mixing one readable and one unreadable file returns a correct partial result (files_uploaded: 1, files_failed: 1, with bytes_uploaded reflecting only the good file and bytes_total counting both); cutting wifi mid-upload causes the data manager sync to enter TRANSIENT_FAILURE and wait-for-ready/retry rather than crash or lose data; and killing the server mid-upload triggers a graceful shutdown that closes the module and data manager cleanly without deleting the interrupted source. Finally, for a stress/concurrency test, I ran a fake sensor capturing at 10 Hz with scheduled sync enabled while triggering UploadDataFromPath on a 1.5 GB directory (300 × 5 MB files) — the upload completed fully (files_uploaded: 300, files_failed: 0, all 1,572,864,000 bytes), sensor capture stayed continuous at 10 Hz with no gaps throughout, both streams reached the cloud, and the logs stayed clean, confirming the new API coexists with normal data manager capture/sync under load without interference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant