Process Internals
Internal data and files
When platform migrator is installed, it creates an internal directory,
.platform_migrator, in the home directory of the user. This directory
stores all data related to the attempted migrations and also acts as the
initial working directory for the process and the server. Platform migrator
does not put any internal data outside this directory unless instructed
otherwise.
Starting the server
When the command platform-migrator server start is executed, the
main module switches the working directory to
.platform_migrator, executes the script
server as a subprocess, and exits.
The server script contains
MigrateRequestHandler and
PMServer classes, which are the request
handler and the HTTP server respectively. The request handler implements a
do_GET() method,
which listens for GET requests on the /migrate route, and a
do_POST() method, which
listens for incoming data on the /yml, /min and /zip routes. The
/yml route is to receive the conda environment YAML file from the base
system, /min for the minimal dependencies identified, if any, and /zip
to receive a zip of the software itself.
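The routing described above can be sketched with the standard library's http.server module. This is only an illustration: the class name SketchHandler, the placeholder response body and the status codes are assumptions, not the actual MigrateRequestHandler implementation.

```python
from http.server import BaseHTTPRequestHandler


class SketchHandler(BaseHTTPRequestHandler):
    """Hypothetical handler mirroring the routes described in the text."""

    POST_ROUTES = ("/yml", "/min", "/zip")

    def do_GET(self):
        # Only the /migrate route answers GET requests.
        if self.path != "/migrate":
            self.send_error(404)
            return
        body = b"# the probing script would be returned here\n"  # placeholder
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Only the three data routes accept POST requests.
        if self.path not in self.POST_ROUTES:
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # a real server would decode and store this
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()
```

A handler like this would be passed to an HTTP server class together with the host and port to listen on.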
All three POST routes take a JSON input with two attributes, name and
data, where name contains the name of the software being migrated and
data contains base64 encoded binary data corresponding to what the
route takes.
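As a minimal sketch of that payload format, the following builds and parses a JSON body with the name and data attributes. The helper names build_payload and parse_payload and the sample values are hypothetical and are not part of platform migrator.

```python
import base64
import json


def build_payload(name, raw_bytes):
    """Return a JSON body with the two attributes the POST routes expect."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return json.dumps({"name": name, "data": encoded})


def parse_payload(body):
    """Return (name, raw_bytes) recovered from a JSON body."""
    obj = json.loads(body)
    return obj["name"], base64.b64decode(obj["data"])


# Round trip with made-up sample data.
body = build_payload("my-software", b"name: my-env\n")
name, raw = parse_payload(body)
```

Base64 keeps arbitrary binary content, such as a zip archive, safe inside a JSON string.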
On startup, the server first creates a PIDFILE in the working directory, which
is used to track which PID the server is running as, and also creates a copy of
base_sys_script with the HOST and PORT
variables updated to the value under which the server is running. This script
is returned as the response when a request is received on the
/migrate route. The server then waits for requests.
The server fails to start if a PIDFILE already exists, so only one instance of the server can be running at any point in time. When the server is stopped, it deletes the PIDFILE and the script before exiting.
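One common way to get this single-instance behaviour is to create the PID file with an exclusive-create flag. The sketch below assumes that approach; the function names and the temporary directory are illustrative, and the real server may implement the check differently.

```python
import os
import tempfile


def acquire_pidfile(path):
    """Create the PID file, raising FileExistsError if it already exists."""
    # O_EXCL makes the creation fail when the file is already present.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))


def release_pidfile(path):
    """Remove the PID file when the server stops."""
    os.remove(path)


with tempfile.TemporaryDirectory() as workdir:
    pidfile = os.path.join(workdir, "PIDFILE")
    acquire_pidfile(pidfile)  # first instance starts
    try:
        acquire_pidfile(pidfile)  # second instance is refused
        second_started = True
    except FileExistsError:
        second_started = False
    release_pidfile(pidfile)  # server stopped
    pidfile_left_behind = os.path.exists(pidfile)
```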
Probing the base system
On the base system, platform migrator probes the conda environment using the
functions in base_sys_script.
Receiving the probing script
This section applies only when transferring using the server. If transferring
using a zip file, the script is executed using the platform-migrator pack
command.
The script gets downloaded to the base system when a request is made to the
platform migrator server on the /migrate path. See the Tutorials for
how to make the request using curl or Python. The script is compatible with
both Python 2 and 3 specifically to allow it to run on older systems that may
not have been upgraded to the latest version of Python.
When the script is executed, it first tries to identify which conda
environment is active and uses that. Otherwise, it prompts the user to enter
the name of a conda environment and tries to source that before continuing.
If user input is required, it assumes that the activate shell script provided
by conda is in the current working directory.
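The environment selection can be sketched as follows. It assumes the active environment is read from the CONDA_DEFAULT_ENV variable that conda sets on activation; the function name pick_conda_env is hypothetical, the sketch targets Python 3 only, and the actual script may detect the environment differently.

```python
import os


def pick_conda_env(environ=None, ask=input):
    """Return the active conda environment name, or ask the user for one."""
    environ = os.environ if environ is None else environ
    active = environ.get("CONDA_DEFAULT_ENV")  # set by conda when an env is active
    if active:
        return active
    # No active environment found: fall back to prompting the user.
    return ask("Name of the conda environment to use: ").strip()
```

Passing the environment mapping and the prompt function as parameters keeps the logic testable without a terminal.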
Handling transitive dependencies
Once the conda environment is identified, the script will run the
get_conda_min_deps() function to
identify a closure of conda packages that can be used to minimize the number
of installs on the target system.
First, all the packages installed in the conda environment are obtained using
the conda list command. Then, for each dependency, a dry run of creating a new
environment with that dependency as a target is done. This gives a list of
packages that are required by the dependency. These packages are a second level
dependency to the software being migrated and are removed from the list of
direct dependencies of the software. Once all the dependencies have been
processed, the remaining ones are the direct requirements for the software.
This list is transferred over to the target system as well and only the
packages in this list are installed on the target system. Their transitive
dependencies are automatically installed by the package manager on the target
system.
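The pruning step can be illustrated with a small sketch. Here the callable requirements_of stands in for the dry run and returns the packages pulled in by a given package; the function name direct_dependencies and the sample data are made up, and this is not the code of get_conda_min_deps().

```python
def direct_dependencies(installed, requirements_of):
    """Drop every package that is pulled in by another installed package."""
    remaining = set(installed)
    for package in installed:
        # Everything the dry run reports, except the package itself,
        # is a second level dependency and can be removed.
        pulled_in = set(requirements_of(package)) - {package}
        remaining -= pulled_in
    return sorted(remaining)


# Made-up dry-run results for a small environment.
dry_run = {
    "python": {"python"},
    "numpy": {"numpy", "python"},
    "scipy": {"scipy", "numpy", "python"},
    "pandas": {"pandas", "numpy", "python"},
}
minimal = direct_dependencies(list(dry_run), dry_run.__getitem__)
# minimal == ["pandas", "scipy"]
```

Note that in this simplified version two packages that require each other would both be removed, so a real implementation may need to handle such cycles.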
Collecting info
Next, the script prompts the user to enter the name of the software and the directory where it is saved. The conda environment data, direct dependency list and the software are zipped up and sent back to the target system using the POST routes of the server. Finally, the script exits.
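Zipping the software directory in memory could look like the sketch below. The function name zip_directory and the sample file are hypothetical; the resulting bytes would still need to be base64 encoded before being posted.

```python
import io
import os
import tempfile
import zipfile


def zip_directory(root):
    """Return the bytes of a zip archive containing every file under root."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(root):
            for filename in files:
                full_path = os.path.join(folder, filename)
                # Store paths relative to root so the archive is relocatable.
                archive.write(full_path, os.path.relpath(full_path, root))
    return buffer.getvalue()


# Demonstration with a throwaway directory holding one file.
with tempfile.TemporaryDirectory() as software_dir:
    with open(os.path.join(software_dir, "run.py"), "w") as handle:
        handle.write("print('hello')\n")
    archive_bytes = zip_directory(software_dir)

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    archived_names = archive.namelist()
```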
Running a migration
Once the data from the base system has been transferred over, any number of
migration attempts can be made on it using platform migrator. All data gets
saved in the ~/.platform_migrator/<software-name>/ directory on the target
system. When platform-migrator migrate <software-name> <test-config> is
executed, the main script parses the command line
arguments and passes them over to the
migrate() function, which works as a
wrapper to control the Migrator class.
The Migrator class parses the test
config files and creates a new migration id for the job. This migration id is
used to create a new directory for the migration and allows users to perform
multiple attempts for the same software. In the future, this may also store
metadata about the attempted migration.
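The idea of one directory per migration attempt can be sketched as follows. The id format (a random UUID), the directory layout and the function name new_migration_dir are assumptions made for illustration; the text does not specify how the Migrator class generates its ids.

```python
import os
import tempfile
import uuid


def new_migration_dir(base_dir, software_name):
    """Create and return a fresh directory for one migration attempt."""
    migration_id = uuid.uuid4().hex  # hypothetical id format
    path = os.path.join(base_dir, software_name, migration_id)
    os.makedirs(path)
    return migration_id, path


# Two attempts for the same software get separate directories.
with tempfile.TemporaryDirectory() as base:
    first_id, first_path = new_migration_dir(base, "my-software")
    second_id, second_path = new_migration_dir(base, "my-software")
    both_exist = os.path.isdir(first_path) and os.path.isdir(second_path)
```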
With conda as a package manager
Next, the conda environment YAML file is parsed and conda internal packages are removed. Now, if the test configuration lists conda as one of the package managers available, platform migrator will just use conda and ignore all other package managers. A new conda environment with the same name as the software is created and the YAML file is used to install the packages in it.
Now, the software is unzipped inside the migration directory and the tests configured for the software are run inside a sub-shell with the new conda environment activated. If multiple tests are configured, each test is run in a separate sub-shell. All tests are run even if one of the tests fails. However, the migration is marked as a failure if any of the tests fail.
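The test policy described above (run every test in its own sub-shell, then fail the migration if any test failed) can be sketched like this. The function name run_tests and the sample commands are hypothetical, and activating the conda environment is deliberately left out.

```python
import subprocess


def run_tests(commands, cwd=None):
    """Run each command in its own shell and report the overall outcome."""
    results = []
    for command in commands:
        # shell=True starts a fresh shell for every command, so one test
        # cannot leak state into the next one.
        completed = subprocess.run(command, shell=True, cwd=cwd)
        results.append((command, completed.returncode == 0))
    # The migration only succeeds if every single test passed.
    migration_ok = all(passed for _command, passed in results)
    return migration_ok, results


# The second command fails, but the third one still runs.
ok, results = run_tests(["exit 0", "exit 3", "exit 0"])
```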
Once the migration is complete, the unzipped software is deleted. If the tests were successful, the software is unzipped again in the output directory from the test configuration.
With external package managers
If conda is not one of the package managers listed in the test configuration,
get_package_manager() is used to
parse the package manager configuration files. The function is a factory
function for creating
PackageManager objects. Once the
package managers are obtained, each package from the direct dependency list
of the software is searched for in the package managers and the user is prompted
to confirm which of the search results should be used. If the user does not
install any of the offered packages from any of the package managers, an option
to abort the migration is offered.
The searches try to offer the user as few options as possible by using stricter search criteria first and falling back to relaxed criteria only if no results are returned. As soon as a search returns results, they are presented to the user for selection.
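The strict-first search order can be sketched with a list of search functions tried in turn. The function name search_with_fallback, the two example criteria and the package catalogue are all made up for illustration; the actual criteria come from the package manager configuration.

```python
def search_with_fallback(package, strategies):
    """Return the results of the first strategy that finds anything."""
    for strategy in strategies:  # ordered from strictest to most relaxed
        results = strategy(package)
        if results:
            return results
    return []


# A made-up catalogue with an exact-match and a substring search.
catalogue = ["numpy", "python-numpy", "numpy-doc"]


def exact(name):
    return [p for p in catalogue if p == name]


def contains(name):
    return [p for p in catalogue if name in p]


strict_hit = search_with_fallback("numpy", [exact, contains])
relaxed_hit = search_with_fallback("nump", [exact, contains])
no_hit = search_with_fallback("scipy", [exact, contains])
```

Here the exact match returns a single option for numpy, while the relaxed search is only consulted when the strict one finds nothing.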
Once all the packages have been selected by the user, they are installed one by one. This is done intentionally so that even if one package fails to install, the other packages can still be installed and the tests can be run.
The tests are run in the same way as in the conda case, except that there is no conda environment to activate.