
Commit aaa9927

Fix data access to external resources (#155)
* Refactor data access to use 'path' instead of 'url' in index records
* Apply PR feedback for path-based data index helpers
* Update data index reference to latest commit SHA
* Update roadmap documentation
1 parent b3011a6 commit aaa9927

4 files changed

Lines changed: 132 additions & 84 deletions


docs/architecture/ROADMAP.md

Lines changed: 81 additions & 57 deletions
@@ -18,9 +18,9 @@ Legend:
 
 ---
 
-# 1. Sample Model
+# 1. Structure Model
 
-## 1.1 Crystal Structure Parameters
+## 1.1 Crystal Structure
 
 ### Space Group
 
@@ -48,35 +48,51 @@ Legend:
 | Occupancy |||
 | Symmetry _wyckoff_letter_ |||
 
-### Atomic Displacement Parameters (ADP)
+### Atomic Displacement (ADP)
 
-| Feature | LIB | APP |
-| ----------------------------------------------- | --- | --- |
-| Isotropic Biso || 🗓 |
-| Isotropic Uiso | 🚧 ||
-| Anisotropic Bani _B11, B22, B33, B12, B13, B23_ | 🚧 | 🗓 |
-| Anisotropic Uani _U11, U22, U33, U12, U13, U23_ | 🚧 | 🗓 |
+| Feature | LIB | APP |
+| --------------------------------------------------- | --- | --- |
+| Isotropic _Biso_ || 🗓 |
+| Isotropic _Uiso_ | 🚧 ||
+| Anisotropic _Bani_ (_B11, B22, B33, B12, B13, B23_) | 🚧 | 🗓 |
+| Anisotropic _Uani_ (_U11, U22, U33, U12, U13, U23_) | 🚧 | 🗓 |
 
 ---
 
-## 1.2 Magnetic Structure Parameters
+## 1.2 Magnetic Structure - EPIC
 
-| Feature | LIB | APP |
-| ------------------------------------------------------- | --- | --- |
-| EPIC (Magnetic space groups, unpolarized and polarized) | 🗓 | 🗓 |
+| Feature | LIB | APP |
+| ----------------------------------------------------- | --- | --- |
+| Magnetic Space Groups | 🗓 | 🗓 |
+| Irreducible representations | 🗓 | 🗓 |
+| Magnetic propagation vector (_kx, ky, kz_) | 🗓 | 🗓 |
+| Magnetic moments (_mx, my, mz_) | 🗓 | 🗓 |
+| Local Susceptibility (_𝜒11, 𝜒22, 𝜒33, 𝜒12, 𝜒13, 𝜒23_) | 🗓 | 🗓 |
 
 ---
 
 # 2. Experiment Model
 
-## 2.1 Powder Diffraction
-
-### Fitting Methods
-
-| Feature | LIB | APP |
-| ------------------------------------- | --- | --- |
-| Rietveld refinement (full pattern) |||
-| Le Bail refinement (profile matching) | 🗓 | 🗓 |
+| Techniques | LIB | APP |
+| ---------------------------------------------------- | ----- | ----- |
+| 2.1. Powder Diffraction | ✅/🗓 | ✅/🗓 |
+| 2.1.1. Common features | ✅/🗓 | ✅/🗓 |
+| 2.1.2. Standard Bragg diffraction (CWL) | ✅/🗓 | ✅/🗓 |
+| 2.1.2. Standard Bragg diffraction (TOF) | ✅/🗓 | ✅/🗓 |
+| 2.1.3. Total Scattering (Pair-Distribution Function) | ✅/🗓 | 🗓 |
+| 2.2. Single-Crystal Diffraction (CWL) | ✅/🗓 | ✅/🗓 |
+| 2.2. Single-Crystal Diffraction (TOF) | ✅/🗓 | ✅/🗓 |
+| 2.3. Polarized Powder Diffraction | 🗓 | 🗓 |
+| 2.3.1. Flipping-ratio method (TOF) | 🗓 | 🗓 |
+| 2.3.1. Flipping-ratio method (CWL) | 🗓 | 🗓 |
+| 2.4. Polarized Single-Crystal Diffraction | 🗓 | 🗓 |
+| 2.4.1. Flipping-ratio method (CWL) | 🗓 | 🗓 |
+| 2.4.2. Flipping-ratio method (TOF) | 🗓 | 🗓 |
+| 2.4.3. Spherical neutron polarimetry | 🗓 | 🗓 |
+
+## 2.1. Powder Diffraction
+
+## 2.1.1 Common features
 
 ### Linked Phases
 

@@ -90,7 +106,14 @@ Legend:
 | ----------------------------------------- | --- | --- |
 | Multiple regions<br>_start/end positions_ || 🗓 |
 
----
+## 2.1.1 Standard Bragg diffraction
+
+### Fitting Methods
+
+| Feature | LIB | APP |
+| ------------------------------------- | --- | --- |
+| Rietveld refinement (full pattern) |||
+| Le Bail refinement (profile matching) | 🗓 | 🗓 |
 
 ### Background
 
@@ -130,20 +153,12 @@ Legend:
 
 ### Peak Profile — Time-of-Flight
 
-CrysPy peak_shape options:
-
-- "Gauss": Jorgensen (back-to-back exponentials ⊗ Gaussian)
-- "pseudo-Voigt": Jorgensen-Von Dreele (back-to-back exponentials ⊗
-  pseudo-Voigt)
-- "type0m": Double back-to-back exponentials ⊗ pseudo-Voigt (Z-Rietveld
-  type 0m)
-
-| Feature | LIB | APP |
-| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --- | --- |
-| Jorgensen (back-to-back exponentials ⊗ Gaussian)<br>_Gaussian broadening σ₀, σ₁, σ₂<br>Back-to-back exponential rise α₀, α₁. Back-to-back exponential decay β₀, β₁_<br>(CrysPy) |||
-| Jorgensen-Von Dreele (back-to-back exponentials ⊗ pseudo-Voigt)<br>_Gaussian broadening σ₀, σ₁, σ₂. Lorentzian broadening γ₀, γ₁, γ₂<br>Back-to-back exponential rise α₀, α₁. Back-to-back exponential decay β₀, β₁_<br>(CrysPy) |||
-| Double back-to-back exponentials ⊗ pseudo-Voigt [Z-Rietveld type0m]<br>_Gaussian broadening σ₀, σ₁, σ₂. Lorentzian broadening γ₀, γ₁, γ₂<br>Rise α₁, α₂. Fast decay β₀₀, β₀₁. Slow decay β₁₀. Switching r₀₁, r₀₂, r₀₃_<br>(CrysPy) || 🗓 |
-| Ikeda-Carpenter ⊗ pseudo-Voigt<br>_Moderator pulse α₀, α₁, β₀, κ<br>Gaussian broadening σ². Lorentzian broadening γ_<br>(CrysFML) | 🗓 | 🗓 |
+| Feature | LIB | APP |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --- | --- |
+| Jorgensen (back-to-back exponentials ⊗ Gaussian)<br>_Gaussian broadening σ₀, σ₁, σ₂<br>Back-to-back exponential rise α₀, α₁. Back-to-back exponential decay β₀, β₁_<br>(CrysPy "Gauss") |||
+| Jorgensen-Von Dreele (back-to-back exponentials ⊗ pseudo-Voigt)<br>_Gaussian broadening σ₀, σ₁, σ₂. Lorentzian broadening γ₀, γ₁, γ₂<br>Back-to-back exponential rise α₀, α₁. Back-to-back exponential decay β₀, β₁_<br>(CrysPy "pseudo-Voigt") |||
+| Double back-to-back exponentials ⊗ pseudo-Voigt [Z-Rietveld type0m]<br>_Gaussian broadening σ₀, σ₁, σ₂. Lorentzian broadening γ₀, γ₁, γ₂<br>Rise α₁, α₂. Fast decay β₀₀, β₀₁. Slow decay β₁₀. Switching r₀₁, r₀₂, r₀₃_<br>(CrysPy "type0m") || 🗓 |
+| Ikeda-Carpenter ⊗ pseudo-Voigt<br>_Moderator pulse α₀, α₁, β₀, κ<br>Gaussian broadening σ². Lorentzian broadening γ_<br>(CrysFML) | 🗓 | 🗓 |
 
 | TOF profile | TOF source | Performance |
 | ------------------------------------------------------------------- | ----------------------------------------------------------------- | ----------- |
@@ -154,13 +169,13 @@ CrysPy peak_shape options:
 
 ---
 
-## 2.1.2 Total Scattering (Pair Distribution Function)
+## 2.1.3 Total Scattering (Pair Distribution Function)
 
 ### Peak Profile
 
-| Feature | LIB | APP |
-| ------------------------------------------------------------------------------------------------------ | --- | --- |
-| GaussianDampedSinc type<br>_cutoff q. broadening q. sharpening δ₁, δ₂<br>damping q, particle diameter_ || 🗓 |
+| Feature | LIB | APP |
+| ------------------------------------------------------------------------------------------------------------------------ | --- | --- |
+| Gaussian-damped sinc termination function<br>_cutoff q. broadening q. sharpening δ₁, δ₂<br>damping q, particle diameter_ || 🗓 |
 
 ---
 
@@ -196,11 +211,20 @@ Gauss or Lorentz mosaicity distribution
 | ------------------------------------ | --- | --- |
 | Individual wavelength per reflection || 🗓 |
 
-## 2.3. Polarized Neutron Diffraction
+## 2.3. Polarized Neutron Powder Diffraction - EPIC
 
-| Feature | LIB | APP |
-| ---------------------------------------------- | --- | --- |
-| EPIC (powders and single crystals, FR and SNP) | 🗓 | 🗓 |
+| Feature | LIB | APP |
+| ---------------------------- | --- | --- |
+| Flipping-ratio method (TOF) | 🗓 | 🗓 |
+| Flipping-ratio method (CWL) | 🗓 | 🗓 |
+
+## 2.3. Polarized Neutron Single Crystal Diffraction - EPIC
+
+| Feature | LIB | APP |
+| ----------------------------- | --- | --- |
+| Flipping-ratio method (TOF) | 🗓 | 🗓 |
+| Flipping-ratio method (CWL) | 🗓 | 🗓 |
+| Spherical neutron polarimetry | 🗓 | 🗓 |
 
 ---
 
@@ -215,19 +239,21 @@ Gauss or Lorentz mosaicity distribution
 
 # 4. Analysis (Fitting)
 
-### Refinement Algorithms
+### Refinement Algorithms (numerical derivatives)
 
-| Feature | LIB | APP |
-| --------------------------------------------------------------- | --- | --- |
-| Levenberg–Marquardt (numerical derivatives)<br>LMFIT minimizer |||
-| Levenberg–Marquardt (analytical derivatives)<br>LMFIT minimizer | 🗓 | 🗓 |
-| Derivative-free minimization<br>DFO-LS minimizer |||
-| Bayesian analysis<br>BUMPS minimizer | 🗓 | 🗓 |
+| Feature | LIB | APP |
+| ---------------------------------------------------- | --- | --- |
+| Levenberg–Marquardt<br>LMFIT minimizer |||
+| Levenberg–Marquardt<br>LMFIT minimizer (scipy-based) |||
+| Levenberg–Marquardt<br>BUMPS minimizer | 🚧 | 🗓 |
+| Derivative-free minimization<br>DFO-LS minimizer |||
+| Bayesian analysis<br>BUMPS minimizer | 🗓 | 🗓 |
 
 ### Fit Strategies
 
 | Feature | LIB | APP |
 | ---------------------------------------------------------------------------------------------------- | --- | --- |
+| Single fit of one experimental data block to one/multiple structural data block |||
 | Sequential fit of experimental data blocks || 🗓 |
 | Joint fit of experimental data blocks within the same calculation engine || 🗓 |
 | Joint fit of experimental data blocks using different calculation engines<br>(e.g. CrysPy + Pdffit2) || 🗓 |
@@ -292,8 +318,8 @@ Gauss or Lorentz mosaicity distribution
 
 | Feature | LIB | APP | CLI |
 | --------------------------------- | --- | --- | --- |
-| List available tutorial notebooks | |||
-| Download tutorial notebooks | |||
+| List available tutorial notebooks | |||
+| Download tutorial notebooks | |||
 
 ---
 
@@ -371,12 +397,10 @@ Gauss or Lorentz mosaicity distribution
 
 ---
 
-# 10. Future Topics
-
-Here, we list features that are not sorted into the above categories,
-but are still on our radar for future development.
+# 10. Unsorted features
 
 - Restraints (soft constraints, e.g. bond lengths, angles)
+- Refinement using analytical derivatives
 - Global optimization algorithms (e.g. simulated annealing)
 - Incommensurate structures
 - 2D Rietveld refinement

src/easydiffraction/utils/utils.py

Lines changed: 30 additions & 13 deletions
@@ -25,10 +25,25 @@
 
 pooch.get_logger().setLevel('WARNING')  # Suppress pooch info messages
 
+_DATA_REPO = 'easyscience/diffraction'
+_DATA_ROOT = 'data'
 # commit SHA preferred
-_DATA_INDEX_REF = '010c69546fa9ec1bd998bdcaa902e1df4f5d10af'
+_DATA_INDEX_REF = '927f96547a80c3328f43f2a69cb9c8048286bcb7'
 # macOS: sha256sum index.json
-_DATA_INDEX_HASH = 'sha256:9449dbba0475158bbce9dea1fbb1e5e596c1f63d41fc136a3e3f5d677c5c6779'
+_DATA_INDEX_HASH = 'sha256:301d6aafdc1ccf5f97d2edb491a6b350f6195f05106f8f38c9bf5530e592c8ec'
+
+
+def _build_data_url(path: str) -> str:
+    path = path.lstrip('/')
+    return f'https://raw.githubusercontent.com/{_DATA_REPO}/{_DATA_INDEX_REF}/{_DATA_ROOT}/{path}'
+
+
+def _record_path(record: dict) -> str:
+    if 'path' in record:
+        return record['path']
+
+    msg = "Index record must contain 'path' key."
+    raise KeyError(msg)
 
 
 def _validate_url(url: str) -> None:
@@ -51,9 +66,13 @@ def _validate_url(url: str) -> None:
         raise ValueError(msg)
 
 
-def _filename_for_id_from_url(data_id: int | str, url: str) -> str:
-    """Return local filename using the extension from the URL."""
-    suffix = pathlib.Path(urlparse(url).path).suffix  # includes leading dot ('.cif', '.xye', ...)
+def _filename_for_id_from_path(data_id: int | str, record_path: str) -> str:
+    """
+    Return local filename using the extension from the record path.
+    """
+    suffix = pathlib.PurePosixPath(
+        record_path
+    ).suffix  # includes leading dot ('.cif', '.xye', ...)
     # If URL has no suffix, fall back to no extension.
     return f'ed-{data_id}{suffix}'
 
@@ -74,10 +93,7 @@ def _normalize_known_hash(value: str | None) -> str | None:
 
 def _fetch_data_index() -> dict:
     """Fetch and cache the diffraction data index.json."""
-    index_url = (
-        'https://raw.githubusercontent.com/easyscience/diffraction/'
-        f'{_DATA_INDEX_REF}/data/index.json'
-    )
+    index_url = _build_data_url('index.json')
     _validate_url(index_url)
 
     destination_dirname = 'easydiffraction'
@@ -170,11 +186,10 @@ def download_data(
         raise KeyError(msg)
 
     record = index[key]
-    url = record['url']
+    record_path = _record_path(record)
+    url = _build_data_url(record_path)
     _validate_url(url)
-
-    known_hash = _normalize_known_hash(record.get('hash'))
-    fname = _filename_for_id_from_url(id, url)
+    fname = _filename_for_id_from_path(id, record_path)
 
     dest_path = pathlib.Path(destination)
     dest_path.mkdir(parents=True, exist_ok=True)
@@ -197,6 +212,8 @@ def download_data(
         log.debug(f"Data #{id} already present at '{file_path}', but will be overwritten.")
         file_path.unlink()
 
+    known_hash = _normalize_known_hash(record.get('hash'))
+
     # Pooch downloads to destination with our controlled filename.
     pooch.retrieve(
         url=url,
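
The path-based lookup introduced in this file can be illustrated in isolation. The following is a minimal sketch, not the module itself: the constants and helper bodies are copied from the hunks above (with the leading underscores dropped from the function names), while the record path `examples/demo.xye` and the record contents are hypothetical values chosen only for illustration.

```python
import pathlib

# Constants as they appear in the diff above.
_DATA_REPO = 'easyscience/diffraction'
_DATA_ROOT = 'data'
_DATA_INDEX_REF = '927f96547a80c3328f43f2a69cb9c8048286bcb7'


def build_data_url(path: str) -> str:
    """Join a repo-relative record path onto the pinned raw-content URL."""
    path = path.lstrip('/')  # tolerate a leading slash in the index record
    return (
        f'https://raw.githubusercontent.com/{_DATA_REPO}/'
        f'{_DATA_INDEX_REF}/{_DATA_ROOT}/{path}'
    )


def record_path(record: dict) -> str:
    """Return the 'path' entry; records without it are rejected."""
    if 'path' in record:
        return record['path']
    raise KeyError("Index record must contain 'path' key.")


def filename_for_id(data_id, rec_path: str) -> str:
    """Build the local filename from the id and the path's extension."""
    suffix = pathlib.PurePosixPath(rec_path).suffix  # '' if no extension
    return f'ed-{data_id}{suffix}'


# Hypothetical index record, for illustration only.
record = {'path': '/examples/demo.xye', 'hash': 'sha256:...'}

path = record_path(record)
url = build_data_url(path)
fname = filename_for_id(12, path)
print(url)
print(fname)
```

Because the URL is assembled from `_DATA_INDEX_REF`, every data file is fetched from the same pinned commit as `index.json`, and a record that still carries only the old `url` key fails with `KeyError` instead of being downloaded from an arbitrary location.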

tests/unit/easydiffraction/test___init__.py

Lines changed: 2 additions & 2 deletions
@@ -53,7 +53,7 @@ def test_lazy_functions_execute_with_monkeypatch(monkeypatch, capsys, tmp_path):
 
     fake_index = {
         '12': {
-            'url': 'https://example.com/data.xye',
+            'path': 'data.xye',
             'hash': 'sha256:...',
             'description': 'Demo dataset',
         }
@@ -72,4 +72,4 @@ def fake_retrieve(**kwargs):
 
     result = utils.download_data(id=12, destination=str(tmp_path), overwrite=True)
     assert Path(result).exists()
-    assert calls['kwargs']['url'] == 'https://example.com/data.xye'
+    assert calls['kwargs']['url'] == utils._build_data_url('data.xye')
