Base Optimizer

Wrapper around Xopt for automated parameter optimization.

This class provides a simplified interface to the Xopt optimization library, designed specifically for integration with the GEECS scanner system. It handles the generation of candidate parameter sets and evaluation of objective functions while maintaining separation from control system logic.

The optimizer supports various optimization algorithms and evaluation modes, with built-in integration for experimental data acquisition and logging.
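
The generate/evaluate cycle described above can be sketched without Xopt or the GEECS stack. In this sketch `ToyOptimizer` is a hypothetical stand-in that only borrows the method names of this class: random sampling replaces the configured Xopt generator, and a plain list replaces the results DataFrame.

```python
import random
from typing import Callable, Dict, List, Tuple


class ToyOptimizer:
    """Hypothetical stand-in that mimics the BaseOptimizer method names."""

    def __init__(
        self,
        bounds: Dict[str, Tuple[float, float]],
        evaluate_function: Callable[[Dict[str, float]], Dict[str, float]],
        seed: int = 0,
    ):
        self.bounds = bounds
        self.evaluate_function = evaluate_function
        self.rows: List[dict] = []
        self._rng = random.Random(seed)

    def generate(self, n: int = 1) -> List[dict]:
        # Random sampling stands in for the configured Xopt generator.
        return [
            {name: self._rng.uniform(lo, hi) for name, (lo, hi) in self.bounds.items()}
            for _ in range(n)
        ]

    def evaluate(self, inputs: List[dict]) -> None:
        # Store each input together with the objective values it produced.
        for point in inputs:
            self.rows.append({**point, **self.evaluate_function(point)})

    def initialize(self, num_initial: int = 1) -> None:
        # Initial random evaluations, as in the real initialize().
        self.evaluate(self.generate(num_initial))

    def get_results(self) -> List[dict]:
        return self.rows


def objective(point: Dict[str, float]) -> Dict[str, float]:
    # Toy objective with its minimum at x = 1.
    return {"f": (point["x"] - 1.0) ** 2}


opt = ToyOptimizer({"x": (-5.0, 5.0)}, objective)
opt.initialize(num_initial=3)
for _ in range(10):
    candidates = opt.generate(n=1)
    opt.evaluate(candidates)

results = opt.get_results()
best = min(results, key=lambda row: row["f"])
```

In the real class the same loop would call `generate()` and `evaluate()` on a `BaseOptimizer`, with the scanner moving devices and acquiring data between the two calls.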

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| vocs | VOCS | Variables, Objectives, and Constraints Specification defining the optimization problem structure. | required |
| evaluate_function | callable | Function that takes a dictionary of variable values and returns a dictionary of objective and constraint results. | required |
| generator_name | str | Name of the Xopt generator algorithm to use (e.g., 'random', 'cnsga', 'upper_confidence_bound'). | required |
| xopt_config_overrides | dict | Dictionary to override default Xopt configuration parameters. | None |
| evaluator | BaseEvaluator | Reference to the evaluator object providing the evaluate_function. | None |
| device_requirements | dict | Dictionary defining required devices and variables for optimization. | None |
| scan_data_manager | ScanDataManager | Manager instance for accessing saved non-scalar data. | None |
| data_logger | DataLogger | Logger instance for accessing shot data and bin information. | None |
| seed_dump_files | list of Path | Paths to prior Xopt dump YAML files whose evaluated data will be loaded into the optimizer before the scan begins. VOCS must be compatible (same variable and objective names); differing bounds produce warnings only. | None |
| move_to_best_on_finish | bool | If True, best_observed_setpoint() is called at scan end so the engine can move devices to the empirically best configuration. Falls back to initial-state restoration if no usable rows exist. | False |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| vocs | VOCS | The optimization problem specification. |
| evaluate_function | callable | The objective function evaluator. |
| generator_name | str | Name of the optimization algorithm being used. |
| evaluator | BaseEvaluator or None | Reference to the evaluator instance. |
| device_requirements | dict | Required devices and variables configuration. |
| xopt | Xopt or None | The underlying Xopt optimizer instance. |
| scan_data_manager | ScanDataManager or None | Data manager for accessing scan data. |
| data_logger | DataLogger or None | Logger for accessing experimental data. |

Methods:

| Name | Description |
| --- | --- |
| initialize | Run initial random evaluations to seed the optimization. |
| generate | Generate candidate parameter sets for evaluation. |
| evaluate | Evaluate candidate points and store results. |
| get_results | Return the complete optimization results. |
| get_best | Return the best observed parameter set. |
| seed_from_dumps | Load historical data from dump files into the optimizer. |
| from_config_file | Create optimizer instance from YAML configuration file. |

Source code in geecs_scanner/optimization/base_optimizer.py
def __init__(
    self,
    vocs: VOCS,
    evaluate_function: Callable[[Dict[str, Any]], Dict[str, Any]],
    generator_name: str,
    xopt_config_overrides: Optional[dict] = None,
    evaluator: Optional[BaseEvaluator] = None,
    device_requirements: Optional[Dict[str, Any]] = None,
    scan_data_manager: Optional["ScanDataManager"] = None,
    data_logger: Optional["DataLogger"] = None,
    seed_dump_files: Optional[List[Path]] = None,
    move_to_best_on_finish: bool = False,
):
    self.vocs = vocs
    self.evaluate_function = evaluate_function
    self.generator_name = generator_name
    self.evaluator = evaluator
    self.device_requirements = device_requirements or {}
    self.xopt: Optional[Xopt] = None
    self.scan_data_manager = scan_data_manager
    self.data_logger = data_logger
    self._n_seeded: int = 0
    self.move_to_best_on_finish: bool = move_to_best_on_finish

    self.xopt_config_overrides: dict[str, Any] = dict(xopt_config_overrides or {})
    self._setup_xopt(self.xopt_config_overrides)

    if seed_dump_files:
        self.seed_from_dumps(seed_dump_files)

Attributes

vocs instance-attribute

vocs = vocs

evaluate_function instance-attribute

evaluate_function = evaluate_function

generator_name instance-attribute

generator_name = generator_name

evaluator instance-attribute

evaluator = evaluator

device_requirements instance-attribute

device_requirements = device_requirements or {}

xopt instance-attribute

xopt: Optional[Xopt] = None

scan_data_manager instance-attribute

scan_data_manager = scan_data_manager

data_logger instance-attribute

data_logger = data_logger

move_to_best_on_finish instance-attribute

move_to_best_on_finish: bool = move_to_best_on_finish

xopt_config_overrides instance-attribute

xopt_config_overrides: dict[str, Any] = dict(xopt_config_overrides or {})

n_seeded property

n_seeded: int

Number of evaluations loaded from dump files before the scan started.

Functions

best_observed_setpoint

best_observed_setpoint() -> Optional[Dict[str, float]]

Return the VOCS variable values of the best-observed row in the Xopt data (xopt.data).

Returns:

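| Type | Description |
| --- | --- |
| dict or None | {variable_name: value} for the row with the best value of the first objective, taking its MINIMIZE/MAXIMIZE direction into account. None if xopt.data is empty, uninitialized, or all rows are errored / have a NaN objective. |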

Source code in geecs_scanner/optimization/base_optimizer.py
def best_observed_setpoint(self) -> Optional[Dict[str, float]]:
    """Return the VOCS-variable values of the best-observed row in X.data.

    Returns
    -------
    dict or None
        ``{variable_name: value}`` for the row with the best objective.
        None if X.data is empty, uninitialized, or all rows are errored /
        have a NaN objective.
    """
    if self.xopt is None or self.xopt.data is None or len(self.xopt.data) == 0:
        return None

    df = self.xopt.data.copy()
    obj = self.vocs.objective_names[0]

    if "xopt_error" in df.columns:
        df = df[df["xopt_error"] != True]  # noqa: E712
    df = df[df[obj].notna()]

    if len(df) == 0:
        return None

    direction = str(self.vocs.objectives[obj]).upper()
    idx = df[obj].idxmax() if direction == "MAXIMIZE" else df[obj].idxmin()

    return {name: float(df.loc[idx, name]) for name in self.vocs.variable_names}
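
The selection logic can be illustrated on plain dictionaries. This is a minimal sketch, not the library code: `best_setpoint` and the sample rows are hypothetical, and a list of dicts stands in for the pandas DataFrame used above.

```python
import math
from typing import Dict, List, Optional


def best_setpoint(
    rows: List[dict],
    variables: List[str],
    objective: str,
    direction: str,
) -> Optional[Dict[str, float]]:
    # Drop errored rows and rows whose objective is missing or NaN.
    usable = [
        row
        for row in rows
        if row.get("xopt_error") is not True
        and row.get(objective) is not None
        and not math.isnan(row[objective])
    ]
    if not usable:
        return None

    # Pick the largest or smallest objective depending on the direction.
    pick = max if direction.upper() == "MAXIMIZE" else min
    best = pick(usable, key=lambda row: row[objective])
    return {name: float(best[name]) for name in variables}


# Hypothetical sample data: one errored row, one NaN row, two usable rows.
rows = [
    {"x": 0.1, "f": 3.0, "xopt_error": False},
    {"x": 0.5, "f": 9.0, "xopt_error": True},
    {"x": 0.9, "f": float("nan"), "xopt_error": False},
    {"x": 0.3, "f": 5.0, "xopt_error": False},
]

best_max = best_setpoint(rows, ["x"], "f", "MAXIMIZE")
best_min = best_setpoint(rows, ["x"], "f", "MINIMIZE")
nothing = best_setpoint([], ["x"], "f", "MAXIMIZE")
```

The errored row has the largest raw objective, so the example also shows why filtering has to happen before the maximum is taken.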

seed_from_dumps

seed_from_dumps(dump_paths: List[Path]) -> int

Load historical data from prior Xopt dump files.

Each file's VOCS is checked for compatibility with this optimizer's VOCS before any data is loaded. Rows where xopt_error is True or any objective column is NaN are filtered out.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dump_paths | List[Path] | Paths to xopt_dump.yaml files written by Xopt.dump(). | required |

Returns:

| Type | Description |
| --- | --- |
| int | Total number of rows added to the optimizer (after filtering). |

Source code in geecs_scanner/optimization/base_optimizer.py
def seed_from_dumps(self, dump_paths: List[Path]) -> int:
    """Load historical data from prior Xopt dump files.

    Each file's VOCS is checked for compatibility with this optimizer's
    VOCS before any data is loaded.  Rows where ``xopt_error`` is True or
    any objective column is NaN are filtered out.

    Parameters
    ----------
    dump_paths:
        Paths to ``xopt_dump.yaml`` files written by ``Xopt.dump()``.

    Returns
    -------
    int
        Total number of rows added to the optimizer (after filtering).
    """
    objective_names = list(self.vocs.objectives.keys())
    all_frames: List[pd.DataFrame] = []
    dump_vocs_pairs = []

    for path in dump_paths:
        path = Path(path)
        if not path.exists():
            logger.warning("Seed dump file not found, skipping: %s", path)
            continue

        try:
            source_vocs, df = load_xopt_dump(path)
        except Exception as exc:  # KeyError is already a subclass of Exception
            logger.warning("Failed to parse dump file %s: %s", path, exc)
            continue

        try:
            check_vocs_compatible(self.vocs, source_vocs, path)
        except ValueError as exc:
            logger.error("Incompatible dump file, skipping: %s", exc)
            continue

        dump_vocs_pairs.append((path, source_vocs))

        # Filter error rows
        if "xopt_error" in df.columns:
            n_before = len(df)
            df = df[df["xopt_error"] != True]  # noqa: E712
            n_errors = n_before - len(df)
            if n_errors:
                logger.info("Filtered %d error row(s) from %s", n_errors, path.name)

        # Filter NaN objective rows
        for obj_name in objective_names:
            if obj_name in df.columns:
                n_before = len(df)
                df = df[df[obj_name].notna()]
                n_nan = n_before - len(df)
                if n_nan:
                    logger.info(
                        "Filtered %d NaN '%s' row(s) from %s",
                        n_nan,
                        obj_name,
                        path.name,
                    )

        if df.empty:
            logger.warning(
                "No valid data in %s after filtering; skipping.", path.name
            )
            continue

        all_frames.append(df)
        logger.info("Loaded %d evaluations from %s", len(df), path.name)

    if not all_frames:
        logger.warning("No valid seed data loaded from any dump file.")
        return 0

    # Cross-dump VOCS bounds consistency (log only)
    if len(dump_vocs_pairs) > 1:
        check_cross_dump_consistency(dump_vocs_pairs)

    combined = pd.concat(all_frames, ignore_index=True)

    _warn_on_duplicate_inputs(combined, self.vocs)

    # Populate xopt.data and propagate to the generator.
    # Xopt.add_data calls generator.add_data internally in recent versions,
    # but the explicit call guards against older versions where it did not.
    self.xopt.add_data(combined)
    if self.xopt.generator.data is None or len(self.xopt.generator.data) == 0:
        self.xopt.generator.add_data(combined)

    self._n_seeded = len(combined)
    logger.info(
        "Seeded optimizer with %d total evaluation(s) from %d dump file(s).",
        self._n_seeded,
        len(dump_paths),
    )
    return self._n_seeded
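
The per-file filtering and the final concatenation can be sketched on plain Python data. The helper names `filter_seed_rows` and `combine_dumps`, the file names, and the sample rows are all hypothetical; the real method works on pandas DataFrames and also checks VOCS compatibility, which this sketch leaves out.

```python
import math
from typing import Dict, List


def filter_seed_rows(rows: List[dict], objective_names: List[str]) -> List[dict]:
    kept = []
    for row in rows:
        # Drop rows flagged as failed evaluations.
        if row.get("xopt_error") is True:
            continue
        # Drop rows where any objective that is present is missing or NaN.
        if any(
            name in row and (row[name] is None or math.isnan(row[name]))
            for name in objective_names
        ):
            continue
        kept.append(row)
    return kept


def combine_dumps(dumps: Dict[str, List[dict]], objective_names: List[str]) -> List[dict]:
    combined: List[dict] = []
    for rows in dumps.values():
        kept = filter_seed_rows(rows, objective_names)
        if not kept:
            continue  # nothing usable in this file, skip it
        combined.extend(kept)
    return combined


# Hypothetical dump contents keyed by file name.
dumps = {
    "a.yaml": [
        {"x": 1.0, "f": 2.0, "xopt_error": False},
        {"x": 2.0, "f": float("nan"), "xopt_error": False},
    ],
    "b.yaml": [{"x": 3.0, "f": 1.0, "xopt_error": True}],
    "c.yaml": [{"x": 4.0, "f": 0.5}],
}

seed_rows = combine_dumps(dumps, ["f"])
n_seeded = len(seed_rows)
```

Here `b.yaml` contributes nothing because its only row is errored, which corresponds to the "No valid data" warning path in the method above.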

initialize

initialize(num_initial: int = 1)

Run initial random evaluations to seed the optimization.

Performs random sampling of the parameter space to provide initial data points for the optimization algorithm. This is particularly important for algorithms that require historical data to function effectively (e.g., Bayesian optimization).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| num_initial | int | Number of random evaluations to perform for initialization. | 1 |

Source code in geecs_scanner/optimization/base_optimizer.py
def initialize(self, num_initial: int = 1):
    """
    Run initial random evaluations to seed the optimization.

    Performs random sampling of the parameter space to provide initial
    data points for the optimization algorithm. This is particularly
    important for algorithms that require historical data to function
    effectively (e.g., Bayesian optimization).

    Parameters
    ----------
    num_initial : int, default=1
        Number of random evaluations to perform for initialization.
    """
    self.xopt.random_evaluate(num_initial)

generate

generate(n: int = 1) -> List[dict]

Generate candidate parameter sets for evaluation.

Uses the configured optimization algorithm to propose new parameter combinations that are likely to improve the objective function based on previously evaluated points.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n | int | Number of candidate parameter sets to generate. | 1 |

Returns:

| Type | Description |
| --- | --- |
| list of dict | List of parameter dictionaries, each representing a set of control variable values to be evaluated. |

Source code in geecs_scanner/optimization/base_optimizer.py
def generate(self, n: int = 1) -> List[dict]:
    """
    Generate candidate parameter sets for evaluation.

    Uses the configured optimization algorithm to propose new parameter
    combinations that are likely to improve the objective function based
    on previously evaluated points.

    Parameters
    ----------
    n : int, default=1
        Number of candidate parameter sets to generate.

    Returns
    -------
    list of dict
        List of parameter dictionaries, each representing a set of
        control variable values to be evaluated.
    """
    return self.xopt.generator.generate(n)

evaluate

evaluate(inputs: List[dict])

Evaluate candidate parameter sets and store results.

Evaluates the provided parameter sets using the configured evaluation function and stores the results in the optimization history for use by future generation steps.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| inputs | list of dict | List of parameter dictionaries to evaluate, typically generated by the generate() method. | required |

Source code in geecs_scanner/optimization/base_optimizer.py
def evaluate(self, inputs: List[dict]):
    """
    Evaluate candidate parameter sets and store results.

    Evaluates the provided parameter sets using the configured evaluation
    function and stores the results in the optimization history for use
    by future generation steps.

    Parameters
    ----------
    inputs : list of dict
        List of parameter dictionaries to evaluate, typically generated
        by the `generate()` method.
    """
    self.xopt.evaluate_data(inputs)

    # If the generator provides diagnostic metadata (e.g., BAX), log it
    metadata: Dict[str, float] = {}
    generator = getattr(self.xopt, "generator", None)
    algo_results = getattr(generator, "algorithm_results", None)

    if isinstance(algo_results, dict):
        center = algo_results.get("solution_center")
        if center is not None:
            try:
                center_values = list(center)
            except TypeError:
                center_values = [center]

            for name, value in zip(self.vocs.variable_names, center_values):
                try:
                    metadata[f"BAX_solution_center[{name}]"] = float(value)
                except (TypeError, ValueError):
                    continue

        entropy = algo_results.get("solution_entropy")
        if entropy is not None:
            try:
                metadata["BAX_solution_entropy"] = float(entropy)
            except (TypeError, ValueError):
                pass

    if metadata and self.evaluator is not None:
        self.evaluator.log_results_for_current_bin(metadata)
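
The metadata-flattening step can be isolated as a standalone function for testing. This is a sketch under the assumption that `algorithm_results` has the shape used above (a dict with optional `solution_center` and `solution_entropy` entries); the function name `flatten_algorithm_results` is hypothetical.

```python
from typing import Any, Dict, List


def flatten_algorithm_results(algo_results: Any, variable_names: List[str]) -> Dict[str, float]:
    metadata: Dict[str, float] = {}
    if not isinstance(algo_results, dict):
        return metadata

    center = algo_results.get("solution_center")
    if center is not None:
        try:
            center_values = list(center)
        except TypeError:
            center_values = [center]  # a scalar center becomes a one-element list

        # One flat key per variable; values that cannot be cast are skipped.
        for name, value in zip(variable_names, center_values):
            try:
                metadata[f"BAX_solution_center[{name}]"] = float(value)
            except (TypeError, ValueError):
                continue

    entropy = algo_results.get("solution_entropy")
    if entropy is not None:
        try:
            metadata["BAX_solution_entropy"] = float(entropy)
        except (TypeError, ValueError):
            pass

    return metadata


mixed = flatten_algorithm_results(
    {"solution_center": [0.5, "bad"], "solution_entropy": 1.25}, ["x", "y"]
)
scalar = flatten_algorithm_results({"solution_center": 2}, ["x", "y"])
absent = flatten_algorithm_results(None, ["x", "y"])
```

Flat scalar keys of this form are what the method passes to `log_results_for_current_bin`, so generators without `algorithm_results` simply log nothing extra.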

get_results

get_results()

Return complete optimization results.

Retrieves the full DataFrame containing all evaluated parameter sets and their corresponding objective and constraint values.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Complete results DataFrame with columns for all variables, objectives, and constraints that have been evaluated. |

Source code in geecs_scanner/optimization/base_optimizer.py
def get_results(self):
    """
    Return complete optimization results.

    Retrieves the full DataFrame containing all evaluated parameter
    sets and their corresponding objective and constraint values.

    Returns
    -------
    pandas.DataFrame
        Complete results DataFrame with columns for all variables,
        objectives, and constraints that have been evaluated.
    """
    return self.xopt.data

get_best

get_best()

Return the best observed parameter set.

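Sorts the results by the objective columns in ascending order and returns the first row. This corresponds to the best observed value for minimized objectives; best_observed_setpoint() additionally takes the MINIMIZE/MAXIMIZE direction of the objective into account.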

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Single-row DataFrame containing the best parameter set and its corresponding objective and constraint values. |

Source code in geecs_scanner/optimization/base_optimizer.py
def get_best(self):
    """
    Return the best observed parameter set.

    Identifies and returns the parameter combination that achieved
    the best objective function value according to the optimization
    criteria (minimize or maximize).

    Returns
    -------
    pandas.DataFrame
        Single-row DataFrame containing the best parameter set and
        its corresponding objective and constraint values.

    """
    return self.xopt.data.sort_values(by=list(self.vocs.objectives.keys()))[:1]

from_config_file classmethod

from_config_file(config_path: str, scan_data_manager: Optional['ScanDataManager'] = None, data_logger: Optional['DataLogger'] = None) -> 'BaseOptimizer'

Create optimizer instance from YAML configuration file.

Loads optimizer configuration, evaluator settings, and VOCS specification from a YAML file and creates a fully configured BaseOptimizer instance. This method provides a convenient way to set up complex optimization problems without manual instantiation.

The evaluator class is dynamically imported based on the module and class name specified in the configuration file.
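
Dynamic import by module and class name can be sketched with the standard library alone. The helper `instantiate` is hypothetical, and `fractions.Fraction` is used only as a convenient stand-in for an evaluator class.

```python
import importlib
from typing import Any, Dict


def instantiate(module_name: str, class_name: str, kwargs: Dict[str, Any]) -> Any:
    # Import the module by its dotted name, look the class up, then call it.
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return cls(**kwargs)


# Stand-in for an evaluator: build fractions.Fraction(3, 4) from strings.
frac = instantiate("fractions", "Fraction", {"numerator": 3, "denominator": 4})
```

A misspelled module raises `ModuleNotFoundError` and a misspelled class raises `AttributeError`, so configuration mistakes surface when the optimizer is created rather than during the scan.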

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config_path | str | Path to the YAML configuration file containing optimizer settings. | required |
| scan_data_manager | ScanDataManager | Instance of ScanDataManager for accessing data during acquisition. | None |
| data_logger | DataLogger | Instance of DataLogger for accessing shot data and bin information. | None |

Returns:

| Type | Description |
| --- | --- |
| BaseOptimizer | Fully configured optimizer instance ready for use. |

Source code in geecs_scanner/optimization/base_optimizer.py
@classmethod
def from_config_file(
    cls,
    config_path: str,
    scan_data_manager: Optional["ScanDataManager"] = None,
    data_logger: Optional["DataLogger"] = None,
) -> "BaseOptimizer":
    """
    Create optimizer instance from YAML configuration file.

    Loads optimizer configuration, evaluator settings, and VOCS specification
    from a YAML file and creates a fully configured BaseOptimizer instance.
    This method provides a convenient way to set up complex optimization
    problems without manual instantiation.


    The evaluator class is dynamically imported based on the module
    and class name specified in the configuration file.

    Parameters
    ----------
    config_path : str
        Path to the YAML configuration file containing optimizer settings.
    scan_data_manager : ScanDataManager, optional
        Instance of ScanDataManager for accessing data during acquisition.
    data_logger : DataLogger, optional
        Instance of DataLogger for accessing shot data and bin information.

    Returns
    -------
    BaseOptimizer
        Fully configured optimizer instance ready for use.
    """
    import importlib
    from geecs_scanner.optimization.config_models import BaseOptimizerConfig

    # Load and validate config using Pydantic model
    with open(config_path, "r") as f:
        config_dict = yaml.safe_load(f)

    # This handles all validation AND auto-generates device_requirements!
    config = BaseOptimizerConfig.model_validate(config_dict)

    # Dynamically import and instantiate evaluator
    evaluator_init_kwargs = config.evaluator.kwargs.copy()
    evaluator_init_kwargs["device_requirements"] = config.device_requirements
    if scan_data_manager:
        evaluator_init_kwargs["scan_data_manager"] = scan_data_manager
    if data_logger:
        evaluator_init_kwargs["data_logger"] = data_logger

    module = importlib.import_module(config.evaluator.module)
    evaluator_class = getattr(module, config.evaluator.class_)
    evaluator = evaluator_class(**evaluator_init_kwargs)

    # Prepare generator overrides, ensuring any relative output paths are rooted in the scan folder
    overrides = dict(config.xopt_config_overrides)
    scan_folder = None
    if scan_data_manager:
        try:
            scan_folder = scan_data_manager.scan_paths.get_folder()
        except AttributeError:
            scan_folder = None
        else:
            scan_folder = Path(scan_folder)
            scan_folder.mkdir(parents=True, exist_ok=True)

            for key, block in list(overrides.items()):
                if not isinstance(block, dict):
                    continue

                file_value = block.get("algorithm_results_file")
                if file_value is None:
                    # Provide a sensible default within the scan directory
                    block["algorithm_results_file"] = str(
                        scan_folder / f"{key}_algo_results"
                    )
                else:
                    path = Path(file_value)
                    if not path.is_absolute():
                        block["algorithm_results_file"] = str(
                            (scan_folder / path).resolve()
                        )

    # Resolve seed_dump_files paths relative to the config file's directory
    resolved_seed_paths: Optional[List[Path]] = None
    if config.seed_dump_files:
        config_dir = Path(config_path).parent
        resolved_seed_paths = []
        for raw in config.seed_dump_files:
            p = Path(raw)
            if not p.is_absolute():
                p = (config_dir / p).resolve()
            if not p.exists():
                logger.warning(
                    "seed_dump_files entry not found (will be skipped): %s", p
                )
            resolved_seed_paths.append(p)

    # Create optimizer using validated config
    return cls(
        vocs=config.vocs,
        evaluate_function=evaluator.get_value,
        generator_name=config.generator.name,
        xopt_config_overrides=overrides,
        evaluator=evaluator,
        device_requirements=config.device_requirements,
        scan_data_manager=scan_data_manager,
        data_logger=data_logger,
        seed_dump_files=resolved_seed_paths,
        move_to_best_on_finish=config.move_to_best_on_finish,
    )
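
The rule for `seed_dump_files` (relative entries are resolved against the directory of the configuration file, absolute entries are kept) can be sketched with `pathlib`. The helper name `resolve_relative_to_config` and the example paths are hypothetical.

```python
from pathlib import Path
from typing import List


def resolve_relative_to_config(config_path: str, entries: List[str]) -> List[Path]:
    config_dir = Path(config_path).parent
    resolved: List[Path] = []
    for raw in entries:
        p = Path(raw)
        if not p.is_absolute():
            # Relative entries are interpreted relative to the config file.
            p = (config_dir / p).resolve()
        resolved.append(p)
    return resolved


# Hypothetical paths: one relative entry and one absolute entry.
paths = resolve_relative_to_config(
    "/tmp/configs/optimizer.yaml", ["dumps/a.yaml", "/data/b.yaml"]
)
```

The method above applies the same pattern to `algorithm_results_file`, except that relative values there are rooted in the scan folder instead of the configuration directory.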