All new methods must be async with `_async` suffix. The `@async_to_sync` class decorator (`core/async_utils.py`) auto-generates sync counterparts at class definition time. Never write sync methods manually on model classes — the decorator handles it.
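As a rough illustration of the idea (not the project's actual `@async_to_sync` implementation), a class decorator can scan for `*_async` coroutine methods and attach blocking counterparts. The decorator name `async_to_sync_sketch` and the `Note` class are hypothetical, and the sketch assumes no event loop is already running.

```python
import asyncio
import inspect


def async_to_sync_sketch(cls):
    """Hypothetical stand-in: for each coroutine method `<name>_async`,
    attach a blocking `<name>` wrapper to the class."""
    for attr_name, attr in list(vars(cls).items()):
        if attr_name.endswith("_async") and inspect.iscoroutinefunction(attr):
            sync_name = attr_name[: -len("_async")]

            def make_sync(async_method):
                def sync_method(self, *args, **kwargs):
                    # Simplification: assumes no event loop is running yet.
                    return asyncio.run(async_method(self, *args, **kwargs))

                return sync_method

            setattr(cls, sync_name, make_sync(attr))
    return cls


@async_to_sync_sketch
class Note:
    def __init__(self, text):
        self.text = text

    async def store_async(self):
        await asyncio.sleep(0)  # placeholder for a REST call
        return f"stored:{self.text}"
```

With this in place, `Note("a").store()` runs `store_async()` to completion, which is why only the async method needs to be written by hand.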
### `wrap_async_to_sync()` for standalone functions
Use `wrap_async_to_sync()` (not `@async_to_sync`) for free-standing async functions outside of classes — see `operations/` layer for the pattern. The class decorator only works on classes.
### Protocol classes for sync type hints
Each model in `models/` has a corresponding protocol in `models/protocols/` defining the sync method signatures. When adding a new async method to a model, add its sync signature to the protocol class so IDE type hints work.
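A minimal sketch of how such a protocol might look, assuming a hypothetical `Document` model. The class names are illustrative only, and the sync method is written by hand here purely to keep the example runnable; in the project it would come from the class decorator.

```python
import asyncio
from typing import Protocol


class DocumentSynchronousProtocol(Protocol):
    """Hypothetical protocol: declares only the sync signature so that
    IDEs and type checkers can see a method generated at runtime."""

    def store(self) -> "Document":
        ...


class Document(DocumentSynchronousProtocol):
    def __init__(self, name: str) -> None:
        self.name = name

    async def store_async(self) -> "Document":
        await asyncio.sleep(0)  # placeholder for a REST call
        return self

    # Hand-written only for this sketch; normally generated by the decorator.
    def store(self) -> "Document":
        return asyncio.run(self.store_async())
```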
### Dataclass models with `fill_from_dict()`
Models are `@dataclass` classes, NOT Pydantic. REST responses are deserialized via `fill_from_dict()` methods on each model. New models must follow this pattern.
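The pattern can be sketched as follows. `FolderSketch` and its fields are hypothetical; the only thing taken from the text above is that a dataclass exposes a `fill_from_dict()` method that copies a REST response onto the instance.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FolderSketch:
    id: Optional[str] = None
    name: Optional[str] = None
    parent_id: Optional[str] = None

    def fill_from_dict(self, data: Dict[str, Any]) -> "FolderSketch":
        # Map camelCase REST keys onto snake_case attributes.
        self.id = data.get("id")
        self.name = data.get("name")
        self.parent_id = data.get("parentId")
        return self
```

Returning `self` lets callers write `FolderSketch().fill_from_dict(response)` in one expression.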
### Concrete types are Java class names
`core/constants/concrete_types.py` maps Java class names (e.g., `org.sagebionetworks.repo.model.FileEntity`) for polymorphic entity deserialization. When adding new entity types, register the concrete type string here AND in `api/entity_factory.py` AND in `models/mixins/asynchronous_job.py` if it's an async job type.
### Options dataclass pattern
The `operations/` layer uses dataclass option objects (`StoreFileOptions`, `FileOptions`, `TableOptions`, etc.) to bundle type-specific configuration for CRUD operations. Follow this pattern for new entity-type-specific options.
### Mixin composition for shared behavior
Shared functionality lives in `models/mixins/` (AccessControllable, StorableContainer, AsynchronousJob, etc.). Prefer adding to existing mixins over duplicating logic across models.
Use `SYNPY-{issue_number}` or `synpy-{issue_number}` prefix for feature branches
```
synapseclient/
├── client.py   # Synapse class — public entry point, REST methods, auth
├── api/        # REST API layer — one file per resource type
```
Data flow: User → `operations/` factory → model async methods → `api/` service functions → `client.py` REST calls → Synapse API. Responses are deserialized via `fill_from_dict()` on model instances.
## Constraints
- Unit tests must not make network calls — `pytest-socket` blocks all sockets. Use `pytest-mock` for HTTP mocking.
- `develop` is the default/main branch, not `main` or `master`. PRs target `develop`.
- Legacy classes in root `synapseclient/` (entity.py, table.py, etc.) are kept for backwards compatibility. New features go in `models/` using the dataclass pattern.
- Avoid adding new methods to `client.py` (9600+ lines) — prefer the `api/` + `models/` layered pattern.
- `synapseutils/` is legacy sync-only (uses `requests`, NOT `httpx`). Do not add async methods there — new async equivalents go in `models/` or `operations/`.
User-facing documentation for the Synapse Python Client. Built with MkDocs + Material theme, deployed via GitHub Pages. Follows the Diataxis documentation framework with four content types: tutorials, guides, reference, and explanations.
## Stack
MkDocs with Material theme, mkdocstrings (Google-style docstrings), termynal (CLI animations), markdown-include (file embedding).
## Conventions
### Content types (Diataxis framework)
- **tutorials/** — Step-by-step learning (competence-building). Themed around a biomedical researcher working with Alzheimer's Disease data. Progressive build-up: Project → Folder → File → Annotations → etc.
- **guides/** — How-to guides for specific use cases (problem-solution oriented). Includes extension-specific guides (curator).
- **reference/** — API reference auto-generated from docstrings via mkdocstrings. Split into `experimental/sync/` and `experimental/async/` for new OOP API.
- **explanations/** — Deep conceptual content ("why" not just "how"). Design decisions, internal machinery.
### File inclusion pattern (markdown-include)
Tutorial code lives in `tutorials/python/tutorial_scripts/*.py` and is embedded in markdown via line-range includes:
Single source of truth — edit the `.py` file, not the markdown. Changing line numbers in scripts requires updating the line ranges in the corresponding `.md` files.
### mkdocstrings reference generation
Reference markdown files use `::: synapseclient.ClassName` syntax to trigger auto-generation from docstrings. Key configuration:
- `members_order: source` — preserve source code order
- `filters: ["!^_", "!to_synapse_request", "!fill_from_dict"]` — private members, `to_synapse_request()`, and `fill_from_dict()` are excluded from docs
- `inherited_members: true` — shows mixin methods on inheriting classes
- Member lists are explicit — each reference page specifies which methods to document
### Anchor links for cross-referencing
Pattern: `[](){ #reference-anchor }` in reference pages. Tutorials link to reference via `[API Reference][project-reference-sync]`. Explicit type hints use: `[syn.login][synapseclient.Synapse.login]`.
### termynal CLI animations
Terminal animation blocks are marked with a `<!-- termynal -->` HTML comment. Prompts are configured as `$` or `>`. Used in authentication.md and installation docs.
### Custom CSS (`css/custom.css`)
- API reference indentation: `doc-contents` has 25px left padding with border
- Smaller table font (0.7rem) for API docs
- Wide layout: `max-width: 1700px` for complex content
### Navigation structure
Defined in the `mkdocs.yml` nav section. 6 main sections: Home, Tutorials, How-To Guides, API Reference, Further Reading, News. API Reference has ~85 markdown files (~40 legacy, ~45 experimental).
## Constraints
- Do not edit tutorial code inline in markdown — edit the `.py` script file in `tutorial_scripts/` and update line ranges if needed.
- Reference docs auto-generate from source docstrings — to change method documentation, edit the docstring in the Python source, not the markdown.
- `mkdocs.yml` is at the repo root, not in `docs/` — it configures the entire doc build.
- Docs deploy via `mkdocs gh-deploy --force` targeting the `master` branch (not `develop`).
Available methods: `rest_get_async`, `rest_post_async`, `rest_put_async`, `rest_delete_async`. Pass `endpoint=client.fileHandleEndpoint` for file handle operations; omit for the default repository endpoint. Use `json.dumps()` for request bodies — not raw dicts.
### Return values
- Most functions return raw `Dict[str, Any]` — transformation happens in the model layer via `fill_from_dict()`
- Some return typed dataclass instances (e.g., `EntityHeader` from `entity_services.py`) when the data is only used internally
- Delete operations return `None`
### Pagination
Use helpers from `api_client.py`:
- `rest_get_paginated_async()` — for GET endpoints with limit/offset. Expects `results` or `children` key in response.
- `rest_post_paginated_async()` — for POST endpoints with `nextPageToken`. Expects `page` array in response.
Both are async generators yielding individual items.
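To make the limit/offset case concrete, here is a self-contained sketch of an async generator that walks pages and yields individual items. It does not call the real helpers: `fake_rest_get`, `paginate_get_sketch`, and the sample rows are all hypothetical stand-ins.

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List

FAKE_ROWS = [{"id": f"syn{i}"} for i in range(5)]


async def fake_rest_get(limit: int, offset: int) -> Dict[str, Any]:
    """Hypothetical stand-in for a paginated GET endpoint."""
    await asyncio.sleep(0)
    return {"results": FAKE_ROWS[offset : offset + limit]}


async def paginate_get_sketch(limit: int = 2) -> AsyncGenerator[Dict[str, Any], None]:
    offset = 0
    while True:
        page = await fake_rest_get(limit=limit, offset=offset)
        # Accept either key, mirroring the `results` / `children` convention.
        items = page.get("results") or page.get("children") or []
        if not items:
            return
        for item in items:
            yield item
        offset += len(items)


async def collect_ids() -> List[str]:
    return [item["id"] async for item in paginate_get_sketch()]
```

Callers consume such a generator with `async for`, so pages are fetched lazily as items are needed.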
### Entity factory (`entity_factory.py`)
Polymorphic entity deserialization via concrete type dispatch. Maps Java class names from `core/constants/concrete_types.py` to model classes. When adding a new entity type, register the type mapping here.
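A simplified sketch of concrete type dispatch, not the actual `entity_factory.py` code. The model classes, the `TYPE_MAP` name, and the `entity_from_dict()` helper are hypothetical, and the `Folder` type string is an assumption; only the `FileEntity` string appears in the text above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

FILE_ENTITY = "org.sagebionetworks.repo.model.FileEntity"
FOLDER_ENTITY = "org.sagebionetworks.repo.model.Folder"  # assumed value


@dataclass
class FileModel:
    id: Optional[str] = None

    def fill_from_dict(self, data: Dict[str, Any]) -> "FileModel":
        self.id = data.get("id")
        return self


@dataclass
class FolderModel:
    id: Optional[str] = None

    def fill_from_dict(self, data: Dict[str, Any]) -> "FolderModel":
        self.id = data.get("id")
        return self


# Hypothetical dispatch table: concrete type string -> model class.
TYPE_MAP = {FILE_ENTITY: FileModel, FOLDER_ENTITY: FolderModel}


def entity_from_dict(data: Dict[str, Any]):
    concrete_type = data.get("concreteType")
    model_cls = TYPE_MAP.get(concrete_type)
    if model_cls is None:
        raise ValueError(f"Unregistered concrete type: {concrete_type!r}")
    return model_cls().fill_from_dict(data)
```

An unregistered type fails loudly here, which is the situation the registration step is meant to prevent.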
### Adding a new service file
1. Create `synapseclient/api/new_service.py`
2. Add all public functions to the `api/__init__.py` imports and `__all__` — every public function must be re-exported
3. Use `json.dumps()` for request bodies (not dict)
4. Reference `entity_services.py` for CRUD pattern, `table_services.py` for pagination pattern
- `with_retry_time_based_async()` — time-bounded (default 20 min), exponential backoff with 0.01-0.1 random jitter
- Default retryable status codes: `[429, 500, 502, 503, 504]`
- `NON_RETRYABLE_ERRORS` list overrides status code retry (currently: `["is not a table or view"]`)
- 429 throttling: wait bumps to 16 seconds minimum
- Sets OTel span attribute `synapse.retries` on retry
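The retry rules listed above can be sketched as two small pure functions. This is an illustration only: `should_retry()` and `next_wait()` are hypothetical names, and the doubling factor and `max_wait` cap are assumptions; the status codes, the non-retryable message, the jitter range, and the 16-second floor for 429 come from the list.

```python
import random

RETRYABLE_STATUS_CODES = [429, 500, 502, 503, 504]
NON_RETRYABLE_ERRORS = ["is not a table or view"]


def should_retry(status_code: int, message: str = "") -> bool:
    # A non-retryable message wins over an otherwise retryable status code.
    if any(text in message for text in NON_RETRYABLE_ERRORS):
        return False
    return status_code in RETRYABLE_STATUS_CODES


def next_wait(current_wait: float, status_code: int, max_wait: float = 30.0) -> float:
    # Exponential backoff (assumed factor of 2) plus 0.01-0.1 s random jitter.
    wait = min(current_wait * 2, max_wait) + random.uniform(0.01, 0.1)
    if status_code == 429:
        # Throttling: wait at least 16 seconds.
        wait = max(wait, 16.0)
    return wait
```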
### Credentials chain (`credentials/`)
Provider chain tries in order: login args → config file → env var (`SYNAPSE_AUTH_TOKEN`) → AWS SSM. Credentials implement `requests.auth.AuthBase`, adding `Authorization: Bearer` header. Profile selection via `SYNAPSE_PROFILE` env var or `--profile` arg.
- Progress via `tqdm`; multi-threaded uploads suppress per-file messages via `cumulative_transfer_progress`
### concrete_types.py
Maps Java class names from Synapse REST API for polymorphic deserialization. When adding a new entity type, add its concrete type string here AND in `api/entity_factory.py` type map AND in `models/mixins/asynchronous_job.py` ASYNC_JOB_URIS if it's an async job type.
### Key reusable utilities (`utils.py`)
- `delete_none_keys(d)` — removes None-valued keys from dict. MUST call before all API requests — Synapse rejects null values.
- `id_of(obj)` — extracts Synapse ID from entity, dict, or string
- `concrete_type_of(entity)` — gets the concrete type string from an entity
- `get_synid_and_version(id_str)` — parses "synXXX.N" strings into (id, version) tuples
- `merge_dataclass_entities(source, dest, ...)` — merges fields from one dataclass into another
- `log_dataclass_diff(obj1, obj2)` — logs field-by-field differences between two dataclass instances
- `snake_case(name)` — converts camelCase to snake_case
- `normalize_whitespace(s)` — collapses whitespace
- `MB`, `KB`, `GB` — byte size constants
- `make_bogus_data_file()`, `make_bogus_binary_file(n)`, `make_bogus_uuid_file()` — test file generators (in production code, used by tests)
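To show what three of these helpers are described as doing, here are hypothetical re-implementations under different names. They are not the code in `utils.py`, and details such as in-place mutation, return types, and error handling in the real functions may differ.

```python
import re
from typing import Any, Dict, Optional, Tuple


def without_none_values(d: Dict[str, Any]) -> Dict[str, Any]:
    """Sketch of the `delete_none_keys` idea: drop keys whose value is None.
    Returns a new dict; the real helper may behave differently."""
    return {key: value for key, value in d.items() if value is not None}


def parse_synid_and_version(id_str: str) -> Tuple[str, Optional[int]]:
    """Sketch of the `get_synid_and_version` idea for "synXXX.N" strings."""
    match = re.fullmatch(r"(syn\d+)(?:\.(\d+))?", id_str.strip())
    if match is None:
        raise ValueError(f"Not a Synapse ID: {id_str!r}")
    syn_id, version = match.groups()
    return syn_id, (int(version) if version is not None else None)


def to_snake_case(name: str) -> str:
    """Sketch of the `snake_case` idea for simple camelCase names."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()
```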
### Exception hierarchy (`exceptions.py`)
`SynapseError` base with 14+ subclasses: `SynapseHTTPError`, `SynapseMd5MismatchError`, `SynapseFileNotFoundError`, `SynapseNotFoundError`, `SynapseAuthenticationError`, etc. `_raise_for_status()` and `_raise_for_status_httpx()` handle HTTP error responses with Bearer token redaction via `BEARER_TOKEN_PATTERN` regex.
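Token redaction of this kind can be sketched with a single regular expression. The pattern below is an assumption written for illustration, not the project's actual `BEARER_TOKEN_PATTERN`, and `redact_bearer_tokens()` is a hypothetical helper name.

```python
import re

# Assumed pattern: "Bearer" followed by a run of typical token characters.
BEARER_PATTERN_SKETCH = re.compile(r"(Bearer\s+)[A-Za-z0-9\-._~+/]+=*")


def redact_bearer_tokens(text: str) -> str:
    # Keep the "Bearer " prefix so the message stays readable.
    return BEARER_PATTERN_SKETCH.sub(r"\1[REDACTED]", text)
```

Applying the function to any error text before it is logged or raised keeps the token out of both paths.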
### Rolled-up subdirectories
**`core/models/`** — Internal dataclasses for ACL, Permission, DictObject (dict-like base class), and custom JSON serialization utilities. `DictObject` (`dict_object.py`) provides dot-notation access to dict entries.
**`core/multithread_download/`** — Threaded download manager with `shared_executor()` context manager for external thread pool configuration. Uses `DownloadRequest` dataclass. Default part size: `SYNAPSE_DEFAULT_DOWNLOAD_PART_SIZE`.
Maps Java class name strings (e.g., `org.sagebionetworks.repo.model.FileEntity`) for polymorphic entity deserialization. When adding a new entity or job type, register in THREE places:
1. `concrete_types.py` — add the constant string
2. `api/entity_factory.py` — add to the type dispatch map
3. `models/mixins/asynchronous_job.py` `ASYNC_JOB_URIS` — add if it's an async job type
### limits.py
`MAX_FILE_HANDLE_PER_COPY_REQUEST = 100` and other API batch size limits.
### method_flags.py
Collision handling modes for file downloads: `COLLISION_OVERWRITE_LOCAL`, `COLLISION_KEEP_LOCAL`, `COLLISION_KEEP_BOTH`.
### config_file_constants.py
Section and key names for the `~/.synapseConfig` file. `AUTHENTICATION_SECTION_NAME` identifies the auth section.
Authentication credential providers implementing a chain-of-responsibility pattern for token resolution.
## Conventions
### Provider chain order (priority)
1. **UserArgsCredentialsProvider** — explicit login args passed to `syn.login()`
2. **ConfigFileCredentialsProvider** — `~/.synapseConfig` file (profile-aware via sections)
3. **EnvironmentVariableCredentialsProvider** — `SYNAPSE_AUTH_TOKEN` env var
4. **AWSParameterStoreCredentialsProvider** — AWS SSM Parameter Store (via `SYNAPSE_TOKEN_AWS_SSM_PARAMETER_NAME` env var)
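The chain-of-responsibility idea behind this ordering can be sketched with plain functions. Only two of the four providers are modeled, the config file and AWS SSM lookups are omitted, and every function name and the `auth_token` key are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Provider = Callable[[Dict[str, str], Dict[str, str]], Optional[str]]


def from_user_args(login_args: Dict[str, str], env: Dict[str, str]) -> Optional[str]:
    # Highest priority: a token passed explicitly at login.
    return login_args.get("auth_token")


def from_environment(login_args: Dict[str, str], env: Dict[str, str]) -> Optional[str]:
    return env.get("SYNAPSE_AUTH_TOKEN")


# Order matters: earlier providers win.
PROVIDERS: List[Provider] = [from_user_args, from_environment]


def resolve_token(login_args: Dict[str, str], env: Dict[str, str]) -> Optional[str]:
    for provider in PROVIDERS:
        token = provider(login_args, env)
        if token:
            return token
    return None
```

Passing `env` as a plain dict instead of reading `os.environ` directly keeps the sketch easy to test.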
### Profile selection
Select a profile via the `SYNAPSE_PROFILE` env var or the `--profile` CLI arg. If the username provided in login args differs from the config file username, the config credentials are rejected to prevent ambiguity.
### Token handling
`SynapseAuthTokenCredentials` implements `requests.auth.AuthBase`, adding an `Authorization: Bearer` header. JWT validation failure is non-fatal (it logs a warning and does not raise), which allows tokens with unrecognized formats to attempt API calls.
## Constraints
- Bearer tokens must never appear in logs — redact with `BEARER_TOKEN_PATTERN` regex before logging.
File download from Synapse storage with MD5 validation, collision handling, and progress tracking.
## Conventions
### Primary download path
`download_async.py` is the primary async download implementation. `download_functions.py` contains shared helpers and the sync download wrapper.
### MD5 validation
Post-transfer MD5 validation is mandatory. Raises `SynapseMd5MismatchError` on mismatch — the download is retried automatically (60 retries spanning ~30 minutes).
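A post-transfer check of this kind might look like the sketch below. `Md5MismatchSketch` is a hypothetical stand-in for `SynapseMd5MismatchError`, the function names are illustrative, and the retry loop described above is left out.

```python
import hashlib


class Md5MismatchSketch(Exception):
    """Hypothetical stand-in for the project's MD5 mismatch error."""


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_download(path: str, expected_md5: str) -> None:
    actual = md5_of_file(path)
    if actual != expected_md5:
        raise Md5MismatchSketch(
            f"MD5 mismatch for {path}: expected {expected_md5}, got {actual}"
        )
```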
### Collision handling
Controlled by `if_collision` parameter, using constants from `core/constants/method_flags.py`:
- `overwrite.local` — replace existing local file
- `keep.local` — skip download if local file exists
- `keep.both` — rename downloaded file to avoid collision
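The three modes can be sketched as a small path-resolution function. The `name(N).ext` renaming scheme and the `resolve_download_path()` helper are assumptions for illustration; the actual renaming logic in the download code may differ.

```python
import os
from typing import Callable, Optional

COLLISION_OVERWRITE_LOCAL = "overwrite.local"
COLLISION_KEEP_LOCAL = "keep.local"
COLLISION_KEEP_BOTH = "keep.both"


def resolve_download_path(
    path: str,
    if_collision: str,
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the path to write to, or None when the download is skipped."""
    if not exists(path):
        return path
    if if_collision == COLLISION_OVERWRITE_LOCAL:
        return path
    if if_collision == COLLISION_KEEP_LOCAL:
        return None
    if if_collision == COLLISION_KEEP_BOTH:
        root, ext = os.path.splitext(path)
        counter = 1
        # Assumed scheme: append (1), (2), ... until the name is free.
        while exists(f"{root}({counter}){ext}"):
            counter += 1
        return f"{root}({counter}){ext}"
    raise ValueError(f"Unknown if_collision value: {if_collision!r}")
```

The `exists` parameter is injectable so the decision logic can be exercised without touching the filesystem.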
### Progress tracking
Uses `shared_download_progress_bar` from `core/transfer_bar.py` for tqdm-based progress. Multi-file downloads track cumulative progress via `cumulative_transfer_progress`.