Skip to content

Conversation

pfebrer
Copy link
Contributor

@pfebrer pfebrer commented Apr 14, 2025

Here is the PR with the basis optimization CLI.

Probably I should add something to the documentation.

@zerothi could you test if it works fine for you? You just need to have the inputs for a calculation in a directory and then run:

stoolbox basis optim --geometry file.fdf

The only constraint is that the fdf file can't be named RUN.fdf.

EDIT: If you want to optimize with BADS (the default optimizer) you just need to install pybads:

pip install pybads

# This might be a problem if the arguments are not pickleable. E.g. the minimizer function
# is associated with the minimizer object, which might contain siles.
try:
opt = bads.optimize()

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable opt is not used.
# is associated with the minimizer object, which might contain siles.
try:
opt = bads.optimize()
except TypeError:

Check notice

Code scanning / CodeQL

Empty except Note

'except' clause does nothing but pass and there is no explanatory comment.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for try ... pass? Don't we want it to scream at the user if it fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is explained in the comment above, basically their final saving or whatever (I don't remember exactly what it was) uses pickle and it can fail, but I guess we didn't need it because we do our own saving of the optimization.

for pre, f in [(">", stdout), ("2>", stderr)]:
try:
pipe += f"{pre} {f.name}"
except Exception:

Check notice

Code scanning / CodeQL

Empty except Note

'except' clause does nothing but pass and there is no explanatory comment.
cmd[:-2],
cwd=self.path,
encoding="utf-8",
stdin=open(self.fdf, "r"),

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
# Retrieve delta-x for the jacobian for this
eps = minimize.normalize("delta", with_offset=False)

result = minimize.run(

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning

This assignment to 'result' is unnecessary as it is
redefined
before this value is used.
This assignment to 'result' is unnecessary as it is
redefined
before this value is used.
**optimizer_kwargs,
)
elif optimizer == "swarms":
result = minimize.run(**optimizer_kwargs)

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning

This assignment to 'result' is unnecessary as it is
redefined
before this value is used.
This assignment to 'result' is unnecessary as it is
redefined
before this value is used.
elif optimizer == "swarms":
result = minimize.run(**optimizer_kwargs)
else:
result = minimize.run(

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning

This assignment to 'result' is unnecessary as it is
redefined
before this value is used.
This assignment to 'result' is unnecessary as it is
redefined
before this value is used.
title = "Plotting facility for atom output (run in the atom output directory)"
if is_sub:
global _script
_script = f"{_script} atom-plot"

Check notice

Code scanning / CodeQL

Unused global variable Note

The global variable '_script' is not used.
Copy link
Owner

@zerothi zerothi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll play with it at its current state, but some comments here and there would be beneficial to be cleared out!

self.data.hash.append(current_hash)
metric = self(variables, *args)

if metric is not None:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when would metric be none? Would it even be viable?

Shouldn't something break if the metric is none?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think None meant that it has not been able to compute the metric, e.g. the calculation has failed. It shouldn't break because the optimizer can set variable values that result in a failed calculation.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the way the old thing was working is that if that metric broke, it had to return some too high number. Otherwise some of the minimizations couldn't cope, i.e. it calculates something, expects a metric to be there, but doesn't find one, how should it then select an appropriate next point. SWARM, and others might be able too, but those that have deterministic decisions won't, right?

So, I guess a metric that fails, should scream, did you encounter this?

Copy link
Contributor Author

@pfebrer pfebrer Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I recall having tested with all the minimizers and they all handled a None, but I can't say 100%

bounds = self.normalize_bounds()
bounds = np.array(bounds)

bounds[bounds == 0] = 0.5
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this?

What happens when one has a bounds [0, 0.2] then it would be changed to [0.5, 0.2]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I guess, but who has 0.2 as the maximum cutoff for the basis? I agree that this is not general, And I don't remember what happened if one of the bounds was 0, maybe something nasty 😅

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if you want 0.5 to be the minimum, then we should have bounds[bounds < 0.5] = 0.5, right?

Otherwise we should select a sensible minimum, and let users control everything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes clearly something should be done if there is someone who wants to use this for something other than basis optimization.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just read through my old yaml conf, we have to remove this, because there are soft-confinement things that uses negative bounds, so you would have to know which kind of boundary it attaches to...

This check was only done for the bads, but in principle, it should be there for all, so I'll remove it...

It uses the pybads package to perform the minimization.
"""

def run(self, options={}, get_hard_bounds: Callable = lambda x: x, **kwargs):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename get_hard_bounds to func_conv_bounds_to_hard or something more descriptive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever you prefer

# is associated with the minimizer object, which might contain siles.
try:
opt = bads.optimize()
except TypeError:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for try ... pass? Don't we want it to scream at the user if it fails?

@@ -0,0 +1,1031 @@
# This Source Code Form is subject to the terms of the Mozilla Public
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how much of this file is duplicating the yaml conf file? Seems to have some overlap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to answer this 😅

@zerothi
Copy link
Owner

zerothi commented May 13, 2025

Seems like we should move the basis optim writer to the _yaml_reader and rename it.

get_argparse_parser(
write_basis_to_yaml, name="build", subp=subp, parser_kwargs=parser_kwargs
)
parser_kwargs["aliases"] = ("optim",)

Check failure

Code scanning / CodeQL

Modification of parameter with default Error

This expression mutates a
default value
.
Signed-off-by: Nick Papior <[email protected]>
@zerothi
Copy link
Owner

zerothi commented Jun 19, 2025

@pfebrer could you please review my comments, then we should get it in asap (note I forced pushed a rebase!)

@pfebrer
Copy link
Contributor Author

pfebrer commented Jun 19, 2025

Hey, I have answered them, sorry I think I was too busy when I saw the review

Copy link

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 22.09302% with 67 lines in your changes missing coverage. Please review.

Project coverage is 86.82%. Comparing base (58d60d8) to head (ec336e9).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/sisl/_lib/_argparse.py 22.61% 65 Missing ⚠️
src/sisl/_core/periodictable.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #910      +/-   ##
==========================================
- Coverage   86.92%   86.82%   -0.10%     
==========================================
  Files         412      412              
  Lines       54332    54413      +81     
==========================================
+ Hits        47227    47244      +17     
- Misses       7105     7169      +64     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

get_argparse_parser(
write_basis_to_yaml, name="build", subp=subp, parser_kwargs=parser_kwargs
optimize_basis, name="optimize", subp=subp, parser_kwargs=parser_kwargs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to update the documentation too then, at docs/toolbox/basis/basis.rst

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is optim also valid?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both are valid, but yes, I should have done that (will do!)

@zerothi
Copy link
Owner

zerothi commented Jun 20, 2025

I have some questions, the charge-confinement scheme is not only targeted polarization orbitals. AFAIK, it's use is for non-populated orbitals, and so could be either regular empty orbitals or something else.

I will also add the soft-confinement to play with. Do you have any comments on this before I make changes?


Appropiate optimization bounds are set for each parameter.\n\n

At each step, all corresponding parameters are optimized simultaneously.\n\n
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you clarify for me here, I tried reading it twice, but I don't understand the flow.

It first reads, it will optimize them sequentially. Then it says that all parameters are optimized simultaneously?
What am I not understanding?

Copy link
Contributor Author

@pfebrer pfebrer Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simultaneously means that for example when optimizing the first zeta shell, all first zeta cutoff radii (for all atoms) are optimized at the same time. And same for the polarization orbitals

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but not that they would all cycle the same numbers, right?

Added needed dependency for pyyaml.
While strictly not necessary, I think this is
an ok thing. It is very much used elsewhere, and
we might require it for other things.

Made an alias for stoolbox basis optimize|optim.

Fixed a thing for return numpy arrays in periodictable.

The new CLI registry requires handling of union types, this
will fix it by converting unions to str, regardless of what
it actually is.

Added information when basis information is requested but
the entries won't be there due to too small numbers.
E.g. soft confinement with V0 = 0 will effectively not do
anything. Same for charge confinement for abs(Z) = 0.

Added soft-confinement. Made charge-confinement available for
all orbitals with 0 charge. Soft-confinement are added
to all orbitals, which likely is overkill. It should probably
only be done for narrow orbitals. Well...

When *guessing* the orbitals I found some errors, at least
for gold, it simply omitted the p orbitals, so now there
is a warning, and then it will continue silently.
It's better to rely on user-defined basis-information.

Enabled reading psml files from SIESTA_PS_PATH (not psf!).

Signed-off-by: Nick Papior <[email protected]>
@zerothi
Copy link
Owner

zerothi commented Jun 20, 2025

@pfebrer could you have a look at the latest commit I made, I added soft-confinement, and changed the logic of the charge-confinement (now for q=0 orbitals + polarization orbitals).

Plus some other minor details.

if basis_size != "atoms":
try:
n_zetas = int(basis_size[0])
except:

Check notice

Code scanning / CodeQL

Except block handles 'BaseException' Note

Except block directly handles BaseException.

# Loop through atoms and generate a basis config for each, adding it to config_dict.
for atom in atoms:
table_row = PT.Z_row(abs(atom.Z))

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning

This assignment to 'table_row' is unnecessary as it is
redefined
before this value is used.
# Loop through atoms and generate a basis config for each, adding it to config_dict.
for atom in atoms:
table_row = PT.Z_row(abs(atom.Z))
atom_block = PT.Z_block(abs(atom.Z))

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning

This assignment to 'atom_block' is unnecessary as it is
redefined
before this value is used.
@pfebrer
Copy link
Contributor Author

pfebrer commented Jun 20, 2025

Regarding the charge confinement and soft-confinement, the basis optimization follows a workflow that Federico determined was general and good enough to optimize the basis in "any" system. So it was not meant to be very flexible in what you can optimize, but that was more a feature than a bug I would say. If you want to change something about the workflow, I would talk to Federico

warn(
"The orbitals are pre-filled, but likely they are not correct. "
"Please check the basis before doing excessive optimizations!"
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why likely they are not correct? 😅

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this on gold, and it gave me s, d and f shells, no p... And I wouldn't like to run through them all, so better be safe than sorry ;)

If true, it will optimize, the potential (:math:`V_0`), and the inner
radius :math:`r_i`.

For multiple arguments, each can be fine-tuned on/off.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work from the CLI? If so, how?

Doesn't it make the CLI harder to understand?
--optimize-soft-conf false will optimize soft confinement.

For the charge confinement, are you sure optimizing the three parameters at the same time will result in a good optimization/ is worth it? I also don't understand to which orbitals it applies, Federico said that the charge optimization should only be done for polarization orbitals. So again make sure to talk to Federico because I am not sure that this added flexibility is good.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works by --opt true|true,false,false e.g.
If you say false, it should not run it, but perhaps I forgot to do the proper conversion here... I just did bool on arguments, which when I think about it is wrong. ;)

I'll talk with fede.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants