|
| 1 | +# Troubleshooting |
| 2 | + |
| 3 | +## Where is ...? |
| 4 | + |
| 5 | +### Instrument-specific bluesky code |
| 6 | + |
| 7 | +This is located in each instrument's `inst` script area, with their other instrument-specific scripts. |
| 8 | + |
| 9 | +`c:\Instrument\Settings\config\NDX<inst>\configurations\Python\inst\bluesky` |
| 10 | + |
| 11 | +### This library (`ibex_bluesky_core`) |
| 12 | + |
| 13 | +In a clean installation of IBEX, `ibex_bluesky_core` is installed as a python module via pip into the |
| 14 | +[uktena](https://github.com/IsisComputingGroup/uktena) environment which is in `c:\instrument\apps\python3`. |
| 15 | + |
| 16 | +However, for instruments which have been testing bluesky and may have versions of `ibex_bluesky_core` with local |
| 17 | +modifications, a version of `ibex_bluesky_core` may be checked out into `c:\instrument\dev\ibex_bluesky_core` on the |
| 18 | +instrument, and this will have been editable-installed into the `uktena` environment. |
| 19 | + |
| 20 | +### Other bluesky libraries (`ophyd_async`, `bluesky`, ...) |
| 21 | + |
| 22 | +`ophyd_async` and `bluesky` are installed as python modules. via pip, into the |
| 23 | +[uktena](https://github.com/IsisComputingGroup/uktena) environment which is in `c:\instrument\apps\python3` because |
| 24 | +they are specified in `pyproject.toml` of `ibex_bluesky_core`, which itself is a dependency of `uktena`. |
| 25 | + |
| 26 | +### Logs |
| 27 | + |
| 28 | +Log files are stored in `C:\Instrument\Var\logs\bluesky`. |
| 29 | + |
| 30 | +```{note} |
| 31 | +Older log files will be moved by the log rotation script to |
| 32 | +`\\isis\inst$\Backups$\stage-deleted\ndx<instrument>\Instrument\Var\logs\bluesky` |
| 33 | +``` |
| 34 | + |
| 35 | +The default log level is `INFO` and all messages from `ibex_bluesky_core`, `bluesky` and `ophyd_async` are captured. |
| 36 | + |
| 37 | +If you need to increase this to `DEBUG` to isolate an issue, you can do so using |
| 38 | +{py:obj}`ibex_bluesky_core.log.set_bluesky_log_levels`. |
| 39 | + |
| 40 | +### Scientist-facing data |
| 41 | + |
| 42 | +Scientist-facing output files are written to `\\isis\inst$\NDX<inst>\user\test\scans\<current rb number>` by default. |
| 43 | + |
| 44 | +The path can be customized by passing extra arguments to |
| 45 | +{py:obj}`ibex_bluesky_core.callbacks.file_logger.HumanReadableFileCallback` for the "human-readable" files, or |
| 46 | +{py:obj}`ibex_bluesky_core.callbacks.fitting.livefit_logger.LiveFitLogger` for the fit output files. |
| 47 | + |
| 48 | +### Raw diagnostic data |
| 49 | + |
| 50 | +Raw documents emitted by bluesky are stored in `C:\Instrument\Var\logs\bluesky\raw_documents` - these show the raw |
| 51 | +data emitted by bluesky scans. The filenames in this directory correspond to bluesky's scan ID, which is printed to |
| 52 | +the console at the end of each scan, and is also included as metadata in the scientist-facing output files. |
| 53 | + |
| 54 | +These files are written by {py:obj}`ibex_bluesky_core.callbacks.document_logger.DocLoggingCallback`, which is subscribed |
| 55 | +to the run engine by default in {py:obj}`ibex_bluesky_core.run_engine.get_run_engine`. |
| 56 | + |
| 57 | +```{note} |
| 58 | +Older raw documents will be moved by the log rotation script to |
| 59 | +`\\isis\inst$\Backups$\stage-deleted\ndx<instrument>\Instrument\Var\logs\bluesky\raw_documents` |
| 60 | +``` |
| 61 | + |
| 62 | +--- |
| 63 | + |
| 64 | +## How do I ...? |
| 65 | + |
| 66 | +### Run a plan |
| 67 | + |
| 68 | +To run a plan, you need to pass it to the `RE` object, which is available by default in the top-level python shell in |
| 69 | +the GUI. |
| 70 | + |
| 71 | +For example: |
| 72 | + |
| 73 | +``` |
| 74 | +>>> RE(some_clever_plan(which, takes, some, arguments)) |
| 75 | +``` |
| 76 | + |
| 77 | +Note that without the `RE(...)` call, the plan **does nothing**: |
| 78 | + |
| 79 | +``` |
| 80 | +>>> some_clever_plan(which, takes, some, arguments) |
| 81 | +<generator object scan at 0x000002286311D080> |
| 82 | +``` |
| 83 | + |
| 84 | +### Pause/stop/resume/abort a plan |
| 85 | + |
| 86 | +While a plan is running, a single ctrl-c will stop gracefully after the next point has finished counting. A message |
| 87 | +like the following will be printed: |
| 88 | + |
| 89 | +> A 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. |
| 90 | +> To pause immediately, hit Ctrl+C again in the next 10 seconds. |
| 91 | +> Deferred pause acknowledged. Continuing to checkpoint. |
| 92 | +
|
| 93 | +Two ctrl-c keystrokes within 10 seconds of each other will stop the plan immediately. |
| 94 | + |
| 95 | +When a plan is paused, you will get a short exception message with a summary of your options: |
| 96 | + |
| 97 | +``` |
| 98 | +Pausing... |
| 99 | +Traceback (most recent call last): |
| 100 | + File "C:\Instrument\Apps\Python3\Lib\site-packages\IPython\core\interactiveshell.py", line 3579, in run_code |
| 101 | + exec(code_obj, self.user_global_ns, self.user_ns) |
| 102 | + File "<ipython-input-27-288a69b9f38e>", line 1, in <module> |
| 103 | + RE(bps.sleep(999)) |
| 104 | + File "C:\Instrument\Apps\Python3\Lib\site-packages\bluesky\run_engine.py", line 976, in __call__ |
| 105 | + raise RunEngineInterrupted(self.pause_msg) from None |
| 106 | +bluesky.utils.RunEngineInterrupted: |
| 107 | +Your RunEngine is entering a paused state. These are your options for changing |
| 108 | +the state of the RunEngine: |
| 109 | +
|
| 110 | +RE.resume() Resume the plan. |
| 111 | +RE.abort() Perform cleanup, then kill plan. Mark exit_stats='aborted'. |
| 112 | +RE.stop() Perform cleanup, then kill plan. Mark exit_status='success'. |
| 113 | +RE.halt() Emergency Stop: Do not perform cleanup --- just stop. |
| 114 | +``` |
| 115 | + |
| 116 | +At this point, you will be returned to an interactive shell. You must choose one of these options above before |
| 117 | +attempting further operations with the `RunEngine`. |
| 118 | + |
| 119 | +- `RE.resume()` - resumes the scan from the point at which it was interrupted. The plan will complete as usual. |
| 120 | +- `RE.stop()` and `RE.abort()` are functionally identical - the plan will gracefully terminate, including running any |
| 121 | +cleanup actions and calling registered callbacks. |
| 122 | +- `RE.halt()` tells the RunEngine to drop dead and not do **anything** on the way out. This includes cleanup handlers, |
| 123 | +which for example may return motors to sensible positions or return DAE configuration to a state in which data can be |
| 124 | +taken. |
| 125 | + |
| 126 | +If you are unsure exactly what state your `RunEngine` is in, `RE.state` can be used to check this. If you attempt to |
| 127 | +run another plan before choosing one of the options above, you will get an exception like: |
| 128 | + |
| 129 | +``` |
| 130 | +Traceback (most recent call last): |
| 131 | + File "C:\Instrument\Apps\Python3\Lib\site-packages\IPython\core\interactiveshell.py", line 3579, in run_code |
| 132 | + exec(code_obj, self.user_global_ns, self.user_ns) |
| 133 | + File "<ipython-input-29-288a69b9f38e>", line 1, in <module> |
| 134 | + RE(bps.sleep(999)) |
| 135 | + File "C:\Instrument\Apps\Python3\Lib\site-packages\bluesky\run_engine.py", line 920, in __call__ |
| 136 | + raise RuntimeError(f"The RunEngine is in a {self._state} state") |
| 137 | +RuntimeError: The RunEngine is in a paused state |
| 138 | +``` |
| 139 | + |
| 140 | +This means that you still need to choose how to continue from the previous interruption. |
| 141 | + |
| 142 | +### Get the result of a plan |
| 143 | + |
| 144 | +A plan will return a `RunEngineResult` object, which has a `plan_result` attribute: |
| 145 | + |
| 146 | +``` |
| 147 | +result = RE(some_plan()) |
| 148 | +result.plan_result |
| 149 | +``` |
| 150 | + |
| 151 | +If a plan was interrupted and resumed later, the result is returned by the `RE.resume()` call: |
| 152 | + |
| 153 | +``` |
| 154 | +RE(some_plan()) |
| 155 | +<KeyboardInterrupt> |
| 156 | +result = RE.resume() |
| 157 | +result.plan_result |
| 158 | +``` |
| 159 | + |
| 160 | +```{tip} |
| 161 | +If the result wasn't captured, it's still possible to access it in an IPython interactive shell using the special |
| 162 | +underscore variables, for example `result = _`. |
| 163 | +
|
| 164 | +This is not specific to bluesky - it's an IPython feature where `_` is bound to the return value of the last expression, |
| 165 | +`__` is bound to the result of the second-to-last expression, and so on. |
| 166 | +``` |
| 167 | + |
| 168 | +[Do not try to "hide" the `RunEngine` in a script/function](https://blueskyproject.io/bluesky/main/tutorial.html#plans-in-series). |
| 169 | +The `RE(...)` call should always be typed by the user, at the terminal. |
| 170 | + |
| 171 | +### Connect a device |
| 172 | + |
| 173 | +Bluesky plans need devices to be connected up-front. This is so that errors (such as PVs not existing) are detected |
| 174 | +as soon as possible, not mid-way through a scan. |
| 175 | + |
| 176 | +Top-level plan wrappers should use the `ensure_connected` plan stub from `ophyd_async`: |
| 177 | + |
| 178 | +```python |
| 179 | +from ophyd_async.plan_stubs import ensure_connected |
| 180 | + |
| 181 | + |
| 182 | +def top_level_plan(dae, block): |
| 183 | + yield from ensure_connected(dae, block) |
| 184 | + yield from scan(...) |
| 185 | +``` |
| 186 | + |
| 187 | +If you forget to do this, you will get a stack trace containing: |
| 188 | + |
| 189 | +``` |
| 190 | +NotImplementedError: No PV has been set as connect() has not been called |
| 191 | +``` |
| 192 | + |
| 193 | +### Debug `NotConnected` errors |
| 194 | + |
| 195 | +If `ophyd_async` cannot connect to a PV, you will get an error that looks like: |
| 196 | + |
| 197 | +``` |
| 198 | +sample_changer2: NotConnected: |
| 199 | + readback: NotConnected: ca://TE:NDW2922:CS:SB:sample_changer2 |
| 200 | +``` |
| 201 | + |
| 202 | +In the first instance, check that the relevant PV exists if you get it via `caget`. |
| 203 | + |
| 204 | +If the device is a {py:obj}`block_rw<ibex_bluesky_core.devices.block.BlockRw>`, but the `:SP` record does not exist, |
| 205 | +you may use a `block_w` instead. This will both read and write to the same PV. Similarly, if `blockname:SP:RBV` does |
| 206 | +not exist, a {py:obj}`block_rw_rbv<ibex_bluesky_core.devices.block.BlockRwRbv>` cannot be used. |
| 207 | + |
| 208 | +If the device is a {py:obj}`ibex_bluesky_core.devices.reflectometry.ReflParameter` and `redefine` fails to connect, |
| 209 | +pass `has_redefine=False` when constructing that parameter (this means you won't be able to redefine the position |
| 210 | +of this refl parameter). |
| 211 | + |
| 212 | +If instead you get a `TypeError` that looks like: |
| 213 | + |
| 214 | +``` |
| 215 | +sample_changer: NotConnected: |
| 216 | + readback: TypeError: TE:NDW2922:CS:SB:sample_changer with inferred datatype float cannot be coerced to str |
| 217 | +``` |
| 218 | + |
| 219 | +This is because the **datatype of the underlying PV** does not match the **declared type in bluesky**. `ophyd_async` |
| 220 | +will not allow you to connect a `block_r(str, "some_block")` if `"some_block"` is a float-type PV. Every signal in |
| 221 | +`ophyd_async` is strongly typed. |
| 222 | + |
| 223 | +### Change how `set` on a device behaves |
| 224 | + |
| 225 | +For a {py:obj}`writable block<ibex_bluesky_core.devices.block.BlockRw>`, a `write_config` argument can be specified. |
| 226 | +See the options on {py:obj}`BlockWriteConfig<ibex_bluesky_core.devices.block.BlockWriteConfig>` for detailed options. These are specified |
| 227 | +when first creating the `block` object, which is likely to be in |
| 228 | +`\Instrument\Settings\config\NDX<inst>\configurations\Python\inst\bluesky\devices.py`, or dynamically created as part |
| 229 | +of a wrapper plan. |
| 230 | + |
| 231 | +For complex cases, where a `set` being complete depends on multiple conditions, a custom `ophyd_async` device is usually |
| 232 | +the cleanest way to accomplish this. These can be defined in an instrument's `inst` scripts area if they are specific |
| 233 | +to one instrument, or can be promoted to `ibex_bluesky_core` if generally useful. |
| 234 | + |
| 235 | +### Add extra sleeps into a plan |
| 236 | + |
| 237 | +Generally you should not need to insert sleeps at the *plan* level - prefer instead to modify *devices* so that they |
| 238 | +accurately report when they have finished setting (including any necessary settle times, for example). Writable blocks |
| 239 | +have a {py:obj}`BlockWriteConfig<ibex_bluesky_core.devices.block.BlockWriteConfig>` which has a `settle_time_s` |
| 240 | +parameter for this purpose. |
| 241 | + |
| 242 | +If you *really* need to sleep in a plan, rather than at the device level, use `yield from bps.sleep(duration_seconds)`. |
| 243 | +Do **not** just call `time.sleep()` in a plan - this will stall the event loop and interfere with keyboard interrupts, |
| 244 | +for example. |
| 245 | + |
| 246 | +### Add exception-handling to a plan |
| 247 | + |
| 248 | +In plans, because they are generators, you cannot just use `try-finally` or `try-except`. |
| 249 | + |
| 250 | +Instead, bluesky provides wrappers that implement error handling: |
| 251 | +[`finalize_wrapper`](https://blueskyproject.io/bluesky/main/generated/bluesky.preprocessors.contingency_wrapper.html#bluesky.preprocessors.contingency_wrapper) |
| 252 | +and |
| 253 | +[`contingency_wrapper`](https://blueskyproject.io/bluesky/main/generated/bluesky.preprocessors.finalize_wrapper.html#bluesky.preprocessors.finalize_wrapper). |
| 254 | + |
| 255 | +### Debug DAE failing to begin |
| 256 | + |
| 257 | +Check the DAE can begin a run without using bluesky. |
| 258 | + |
| 259 | +If you are counting using periods, and using the spectrum-data map, this imposes a limit of {math}`5000000` on number of |
| 260 | +`periods * (spectra+1) * (time channels+1)`. This means that scans with high numbers of points may fail, where those |
| 261 | +with fewer points will work. |
| 262 | + |
| 263 | +### Debug a failed fit |
| 264 | + |
| 265 | +Fitting will only ever start after at least as many points have been collected, as there are free parameters in the |
| 266 | +fitting model. For example, a guassian fit will not be attempted with less than 4 data points. This can occur if a |
| 267 | +scan is interrupted early, for example. |
| 268 | + |
| 269 | +Fits can also fail if the data is particularly poor. |
| 270 | + |
| 271 | +### Debug a bad fit |
| 272 | + |
| 273 | +Consider whether you can use a different fitting model. Some models are more prone to getting stuck in local minima than |
| 274 | +others. |
| 275 | + |
| 276 | +For peak-like data, `"com"` or `"max"` from bluesky's |
| 277 | +[`PeakStats callback`](https://blueskyproject.io/bluesky/main/callbacks.html#peakstats) can robustly give *an* answer, |
| 278 | +although often this will not be as good as a full fit and may not always be appropriate. |
| 279 | + |
| 280 | +### Debug keyboard-interrupt problems |
| 281 | + |
| 282 | +If a double ctrl-c does nothing, this is probably because the event loop is stalled - which is likely the result of a |
| 283 | +plan or device doing I/O or other synchronous operations directly, rather than via `asyncio` or `yield from`. |
| 284 | + |
| 285 | +Find the offending function and replace it with it's bluesky-native equivalent. Python's `asyncio` module has a debug |
| 286 | +mode which warns on long-running tasks - this can help identify the offending call. |
| 287 | + |
| 288 | +--- |
| 289 | + |
| 290 | +## What is this syntax ...? |
| 291 | + |
| 292 | +### `async def` / `await` / `asyncio` |
| 293 | + |
| 294 | +Functions declared as `async def` are [python **coroutines**](https://docs.python.org/3/library/asyncio-task.html#). |
| 295 | + |
| 296 | +In a bluesky context, if you see an `async def` function, it is likely that you are looking at either |
| 297 | +an `ophyd_async` device class, or a `RunEngine` message handler. |
| 298 | + |
| 299 | +In order to get the result of a coroutine, it needs to be `await`ed: |
| 300 | + |
| 301 | +```python |
| 302 | +async def foo(): |
| 303 | + result = await some_coroutine() |
| 304 | +``` |
| 305 | + |
| 306 | +In an `async def` function, do not perform blocking operations. For example, always use `await asyncio.sleep()`, |
| 307 | +and not `time.sleep()`. Similarly, do not use synchronous functions which perform I/O from a coroutine (e.g. calling |
| 308 | +`genie` APIs) - use the `async` APIs provided by `ophyd_async` instead. |
| 309 | + |
| 310 | +As a general rule of thumb, any function that could reasonably take more than a few tens of milliseconds, should not be |
| 311 | +called directly from a coroutine. |
| 312 | + |
| 313 | +### `yield` / `yield from` / `Generator` |
| 314 | + |
| 315 | +Functions containing `yield` and/or `yield from` are [python **generators**](https://wiki.python.org/moin/Generators). |
| 316 | + |
| 317 | +If you see `yield from` in a function, you are likely to be looking at a |
| 318 | +[bluesky plan](https://blueskyproject.io/bluesky/main/plans.html#). |
| 319 | + |
| 320 | +`yield from` is the syntax used to delegate to subplans: |
| 321 | + |
| 322 | +```python |
| 323 | +def some_plan(): |
| 324 | + yield from move(...) |
| 325 | + yield from scan(...) |
| 326 | +``` |
| 327 | + |
| 328 | +Will perform an initial move followed by a scan. |
| 329 | + |
| 330 | +Return values from plans can be captured: |
| 331 | + |
| 332 | +```python |
| 333 | +def plan_that_returns(): |
| 334 | + yield from something() |
| 335 | + return "hello, world" |
| 336 | + |
| 337 | +def plan(): |
| 338 | + returned_value = yield from plan_that_returns() |
| 339 | +``` |
| 340 | + |
| 341 | +Like `coroutines` above, long-running functions or functions which perform I/O should not be used in plans. Doing so |
| 342 | +can stall the `RunEngine`'s internal event loop, break keyboard-interrupts, and break rewindability. Use bluesky-native |
| 343 | +functionality instead - for example: |
| 344 | +- [`yield from bps.sleep()`](https://blueskyproject.io/bluesky/main/generated/bluesky.plan_stubs.sleep.html#bluesky.plan_stubs.sleep) instead of `time.sleep()` |
| 345 | +- [`yield from bps.mv(block_object, value)`](https://blueskyproject.io/bluesky/main/generated/bluesky.plan_stubs.mv.html#bluesky.plan_stubs.mv) instead of `g.cset(block_name, value)` |
| 346 | +- [`value = yield from bps.rd(block_object)`](https://blueskyproject.io/bluesky/main/generated/bluesky.plan_stubs.rd.html#bluesky.plan_stubs.rd) instead of `g.cget(block_name)["value"]` |
| 347 | + |
| 348 | +If you must call an external function which may be long running or do I/O from a plan, review |
| 349 | +[call_sync](../plan_stubs/external_code). |
0 commit comments