Models¶

These models are used for argument and return value validation. They are based on the pydantic package.

Definitions¶

class avatars.models.ApiKey[source]¶

Response model for an API key.

created_at: Annotated[AwareDatetime, Field(title='Created At')] [Required]¶

expires_at: Annotated[AwareDatetime, Field(title='Expires At')] [Required]¶

id: Annotated[UUID, Field(title='Id')] [Required]¶

last_used_at: Annotated[AwareDatetime | None, Field(title='Last Used At')] = None¶

name: Annotated[str, Field(title='Name')] [Required]¶

revoked_at: Annotated[AwareDatetime | None, Field(title='Revoked At')] = None¶
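The timestamp fields above can be read together to decide whether a key is still usable. The sketch below is only an illustration: `ApiKeyRecord` and `is_usable` are hypothetical names, the dataclass is a local stand-in rather than `avatars.models.ApiKey`, and the rule "not revoked and not yet expired" is an assumption, not documented server behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ApiKeyRecord:
    """Hypothetical stand-in mirroring the timestamp fields of ApiKey."""

    expires_at: datetime
    revoked_at: Optional[datetime] = None


def is_usable(key: ApiKeyRecord, now: datetime) -> bool:
    """Assumed rule: usable when not revoked and not yet expired."""
    if key.revoked_at is not None and key.revoked_at <= now:
        return False
    return now < key.expires_at


# Timezone-aware datetimes, matching the AwareDatetime annotation.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
active = ApiKeyRecord(expires_at=now + timedelta(days=30))
revoked = ApiKeyRecord(
    expires_at=now + timedelta(days=30),
    revoked_at=now - timedelta(days=1),
)
expired = ApiKeyRecord(expires_at=now - timedelta(days=1))
```

Comparing aware datetimes with each other is safe. Mixing them with naive datetimes raises a TypeError in Python, so `now` should always carry a timezone.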

class avatars.models.ApiKeyWithPlaintext[source]¶

API Key response model that includes the secret for creation.

created_at: Annotated[AwareDatetime, Field(title='Created At')] [Required]¶

expires_at: Annotated[AwareDatetime, Field(title='Expires At')] [Required]¶

id: Annotated[UUID, Field(title='Id')] [Required]¶

last_used_at: Annotated[AwareDatetime | None, Field(title='Last Used At')] = None¶

name: Annotated[str, Field(title='Name')] [Required]¶

plaintext: Annotated[str, Field(title='Plaintext')] [Required]¶

revoked_at: Annotated[AwareDatetime | None, Field(title='Revoked At')] = None¶

class avatars.models.BulkDeleteRequest[source]¶

Request model for bulk job deletion.

job_names: Annotated[list[str], Field(max_length=100, title='Job Names')] [Required]¶

Constraints:

max_length = 100
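Because job_names accepts at most 100 entries, a longer list has to be split before it is sent. The following is a minimal sketch of that batching. `chunk_job_names` and the job names are made up for illustration, and how each batch is then submitted is left out because it depends on the client API.

```python
from typing import Iterator, List

MAX_JOB_NAMES = 100  # max_length constraint documented for job_names


def chunk_job_names(
    job_names: List[str], size: int = MAX_JOB_NAMES
) -> Iterator[List[str]]:
    """Yield consecutive slices of job_names with at most `size` items."""
    for start in range(0, len(job_names), size):
        yield job_names[start:start + size]


names = [f"job-{i}" for i in range(250)]  # hypothetical job names
batches = list(chunk_job_names(names))
```

Each element of `batches` would then fit within a single BulkDeleteRequest.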

class avatars.models.CompatibilityStatus(*values)[source]¶

compatible = 'compatible'¶

incompatible = 'incompatible'¶

unknown = 'unknown'¶

class avatars.models.ExpirationDays(*values)[source]¶

Expiration preset in days (choose from 30/60/120/365/3650)

integer_30 = 30¶

integer_60 = 60¶

integer_120 = 120¶

integer_365 = 365¶

integer_3650 = 3650¶
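Since only these five presets are accepted, a caller with an arbitrary target lifetime needs to map it to a preset. The sketch below shows one way to do that. The enum is a local stand-in that mirrors the documented values (the base class of the real enum is not stated here, so `IntEnum` is an assumption), and `smallest_preset_covering` is a hypothetical helper.

```python
from enum import IntEnum


class ExpirationDays(IntEnum):
    """Local stand-in mirroring the documented presets."""

    integer_30 = 30
    integer_60 = 60
    integer_120 = 120
    integer_365 = 365
    integer_3650 = 3650


def smallest_preset_covering(days: int) -> ExpirationDays:
    """Return the shortest preset that lasts at least `days` days."""
    for preset in sorted(ExpirationDays):
        if preset >= days:
            return preset
    raise ValueError(f"no preset covers {days} days")


chosen = smallest_preset_covering(45)
```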

class avatars.models.CreateApiKeyRequest[source]¶

Request body for creating an API key.

expiration_days: Annotated[ExpirationDays, Field(description='Expiration preset in days (choose from 30/60/120/365/3650)', title='Expiration Days')] [Required]¶: Expiration preset in days (choose from 30/60/120/365/3650)

name: Annotated[str, Field(description='Human-readable label for the API key', max_length=255, min_length=1, title='Name')] [Required]¶

Human-readable label for the API key

Constraints:

min_length = 1
max_length = 255

class avatars.models.CreateApiKeyResponse[source]¶

Response for API key creation.

api_key: ApiKeyWithPlaintext [Required]¶

message: Annotated[str, Field(title='Message')] [Required]¶

class avatars.models.CreditsInfo[source]¶

credits: Annotated[int | None, Field(title='Credits')] [Required]¶

is_credit_enabled: Annotated[bool, Field(title='Is Credit Enabled')] [Required]¶

class avatars.models.EnvironmentInfo[source]¶

Resolved environment values for the current user.

dataset_expiration_days: Annotated[int, Field(description='Number of days before a dataset expires.', title='Dataset Expiration Days')] [Required]¶: Number of days before a dataset expires.

max_allowed_dimensions_per_dataset: Annotated[int, Field(description='Maximum number of dimensions (columns) allowed per dataset.', title='Max Allowed Dimensions Per Dataset')] [Required]¶: Maximum number of dimensions (columns) allowed per dataset.

max_allowed_lines_per_dataset: Annotated[int, Field(description='Maximum number of rows allowed per dataset.', title='Max Allowed Lines Per Dataset')] [Required]¶: Maximum number of rows allowed per dataset.

class avatars.models.EventLogResponse[source]¶

A single audit-trail entry visible to the user.

created_at: Annotated[AwareDatetime, Field(title='Created At')] [Required]¶

id: Annotated[UUID, Field(title='Id')] [Required]¶

object_id: Annotated[UUID | None, Field(title='Object Id')] = None¶

object_type: Annotated[str, Field(title='Object Type')] [Required]¶

verb: Annotated[str, Field(title='Verb')] [Required]¶

class avatars.models.FeatureScope(*values)[source]¶

avatar_parameters = 'avatar_parameters'¶

single_table = 'single_table'¶

multi_table = 'multi_table'¶

time_series = 'time_series'¶

report = 'report'¶

geolocalization = 'geolocalization'¶

privacy_assessment = 'privacy_assessment'¶

differential_privacy = 'differential_privacy'¶

class avatars.models.FeaturesInfo[source]¶

feature_scopes: Annotated[list[FeatureScope], Field(title='Feature Scopes')] [Required]¶

class avatars.models.FileCredentials[source]¶

access_key_id: Annotated[str, Field(title='Access Key Id')] [Required]¶

secret_access_key: Annotated[str, Field(title='Secret Access Key')] [Required]¶

class avatars.models.ForgottenPasswordRequest[source]¶

email: Annotated[str, Field(title='Email')] [Required]¶

class avatars.models.JobCreateRequest[source]¶

depends_on: Annotated[list[str] | None, Field(title='Depends On')] = []¶

parameters_name: Annotated[str, Field(title='Parameters Name')] [Required]¶

set_name: Annotated[UUID, Field(title='Set Name')] [Required]¶

class avatars.models.JobCreateResponse[source]¶

Location: Annotated[str, Field(title='Location')] [Required]¶

name: Annotated[str, Field(title='Name')] [Required]¶

class avatars.models.JobKind(*values)[source]¶

standard = 'standard'¶

privacy_metrics = 'privacy_metrics'¶

signal_metrics = 'signal_metrics'¶

report = 'report'¶

advice = 'advice'¶

class avatars.models.JobStatus(*values)[source]¶

Status of a job in its lifecycle.

Typical happy-path order: QUEUED → CREATED → PENDING → FINISHED

Error paths:

PARENT_ERROR: a dependency job failed
ERROR: the job itself failed
LOST: the worker disappeared
ORPHANED: the worker lost contact before running

DEFAULT (“”) is the initial value before any status is assigned. It appears in the member list as field_.

queued = 'queued'¶

created = 'created'¶

orphaned = 'orphaned'¶

parent_error = 'parent_error'¶

error = 'error'¶

finished = 'finished'¶

field_ = ''¶

pending = 'pending'¶

lost = 'lost'¶
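The grouping described in the docstring can be expressed as a small helper. This is a sketch under stated assumptions: the enum is a local stand-in that mirrors the documented values rather than an import of `avatars.models.JobStatus`, and `ERROR_PATHS` and `is_error_path` are hypothetical names.

```python
from enum import Enum


class JobStatus(str, Enum):
    """Local stand-in mirroring the documented members."""

    queued = "queued"
    created = "created"
    orphaned = "orphaned"
    parent_error = "parent_error"
    error = "error"
    finished = "finished"
    field_ = ""  # the DEFAULT value described in the docstring
    pending = "pending"
    lost = "lost"


# The four statuses the docstring lists under "Error paths".
ERROR_PATHS = frozenset(
    {
        JobStatus.parent_error,
        JobStatus.error,
        JobStatus.lost,
        JobStatus.orphaned,
    }
)


def is_error_path(status: JobStatus) -> bool:
    """True when the status is one of the documented error paths."""
    return status in ERROR_PATHS


status = JobStatus("lost")  # e.g. a raw string taken from a response
```

Looking a member up by value, as in the last line, also works for the empty string, which resolves to field_.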

class avatars.models.LoginResponse[source]¶

access_token: Annotated[str, Field(title='Access Token')] [Required]¶

refresh_token: Annotated[str | None, Field(title='Refresh Token')] = None¶

token_type: Annotated[str, Field(title='Token Type')] [Required]¶

class avatars.models.ResetPasswordRequest[source]¶

email: Annotated[str, Field(title='Email')] [Required]¶

new_password: Annotated[str, Field(title='New Password')] [Required]¶

new_password_repeated: Annotated[str, Field(title='New Password Repeated')] [Required]¶

token: Annotated[UUID, Field(title='Token')] [Required]¶

class avatars.models.ResourceSetResponse[source]¶

display_name: Annotated[str, Field(title='Display Name')] [Required]¶

set_name: Annotated[UUID, Field(title='Set Name')] [Required]¶

class avatars.models.UserRole(*values)[source]¶

admin = 'admin'¶

user = 'user'¶

class avatars.models.ValidationError[source]¶

ctx: Annotated[dict[str, Any] | None, Field(title='Context')] = None¶

input: Annotated[Any | None, Field(title='Input')] = None¶

loc: Annotated[list[str | int], Field(title='Location')] [Required]¶

msg: Annotated[str, Field(title='Message')] [Required]¶

type: Annotated[str, Field(title='Error Type')] [Required]¶

class avatars.models.GrantType[source]¶

root: Annotated[str, Field(pattern='^password$', title='Grant Type')] [Required]¶

Constraints:

pattern = ^password$
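The pattern constraint means the only accepted value is the exact string password. A quick way to see this with the standard library is sketched below. `is_valid_grant_type` is a hypothetical helper, and `re.fullmatch` is used here as an approximation of the validation, not as the library's own implementation.

```python
import re

GRANT_TYPE_PATTERN = re.compile(r"^password$")  # pattern documented above


def is_valid_grant_type(value: str) -> bool:
    """Check a candidate value against the documented pattern."""
    return GRANT_TYPE_PATTERN.fullmatch(value) is not None


accepted = is_valid_grant_type("password")
rejected = [is_valid_grant_type(v) for v in ("Password", "password ", "")]
```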

class avatars.models.Login[source]¶

client_id: Annotated[str | None, Field(title='Client Id')] = None¶

client_secret: Annotated[SecretStr | None, Field(title='Client Secret')] = None¶

grant_type: Annotated[GrantType | None, Field(title='Grant Type')] = None¶

password: Annotated[SecretStr, Field(title='Password')] [Required]¶

scope: Annotated[str | None, Field(title='Scope')] = ''¶

username: Annotated[str, Field(title='Username')] [Required]¶

class avatars.models.CompatibilityResponse[source]¶

message: Annotated[str, Field(title='Message')] [Required]¶

most_recent_compatible_client: Annotated[str | None, Field(title='Most Recent Compatible Client')] = None¶

status: CompatibilityStatus [Required]¶

class avatars.models.CreateUser[source]¶

Create a user with an email.

email: Annotated[str, Field(title='Email')] [Required]¶

password: Annotated[str | None, Field(title='Password')] = None¶

role: UserRole | None = UserRole.user¶

class avatars.models.FileAccess[source]¶

credentials: FileCredentials [Required]¶

url: Annotated[str, Field(title='Url')] [Required]¶

class avatars.models.HTTPValidationError[source]¶

detail: Annotated[list[ValidationError] | None, Field(title='Detail')] = None¶
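When a request is rejected, the loc, msg, and type fields of each ValidationError can be flattened into readable lines. The sketch below works on a plain dictionary shaped like HTTPValidationError. The payload content is invented for illustration, and `format_validation_errors` is a hypothetical helper.

```python
from typing import Any, Dict, List


def format_validation_errors(payload: Dict[str, Any]) -> List[str]:
    """Turn an HTTPValidationError-shaped dict into one line per error."""
    lines = []
    # detail is optional and may be None, so fall back to an empty list.
    for error in payload.get("detail") or []:
        location = ".".join(str(part) for part in error["loc"])
        lines.append(f"{location}: {error['msg']} ({error['type']})")
    return lines


# Made-up payload for illustration only.
payload = {
    "detail": [
        {
            "loc": ["body", "job_names", 0],
            "msg": "Input should be a valid string",
            "type": "string_type",
        }
    ]
}
messages = format_validation_errors(payload)
```

The loc list mixes strings and integers, which is why each part is converted with `str` before joining.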

class avatars.models.JobResponse[source]¶

created_at: Annotated[AwareDatetime, Field(title='Created At')] [Required]¶

deleted_at: Annotated[AwareDatetime | None, Field(title='Deleted At')] = None¶

display_name: Annotated[str, Field(title='Display Name')] [Required]¶

done: Annotated[bool, Field(title='Done')] [Required]¶

exception: Annotated[str, Field(title='Exception')] [Required]¶

kind: JobKind [Required]¶

name: Annotated[str, Field(title='Name')] [Required]¶

parameters_name: Annotated[str, Field(title='Parameters Name')] [Required]¶

progress: Annotated[float | None, Field(title='Progress')] [Required]¶

set_name: Annotated[UUID, Field(title='Set Name')] [Required]¶

status: JobStatus [Required]¶

class avatars.models.JobResponseList[source]¶

jobs: Annotated[list[JobResponse], Field(title='Jobs')] [Required]¶

class avatars.models.MeUser[source]¶

email: Annotated[str, Field(title='Email')] [Required]¶

environment: EnvironmentInfo [Required]¶

id: Annotated[UUID, Field(title='Id')] [Required]¶

organization_id: Annotated[UUID, Field(title='Organization Id')] [Required]¶

role: UserRole | None = UserRole.user¶

class avatars.models.User[source]¶

email: Annotated[str, Field(title='Email')] [Required]¶

id: Annotated[UUID, Field(title='Id')] [Required]¶

organization_id: Annotated[UUID, Field(title='Organization Id')] [Required]¶

role: UserRole | None = UserRole.user¶

class avatars.models.BulkDeleteResponse[source]¶

Response model for bulk job deletion.

deleted_jobs: Annotated[list[JobResponse], Field(title='Deleted Jobs')] [Required]¶

failed_jobs: Annotated[list[str], Field(title='Failed Jobs')] [Required]¶

class avatars.models.Processor(*args, **kwargs)[source]¶

preprocess(df: DataFrame) → DataFrame[source]¶

postprocess(source: DataFrame, dest: DataFrame) → DataFrame[source]¶

class avatar_yaml.models.parameters.AlignmentMethod(*values)[source]¶

Bases: str, Enum

SPECIFIED = 'specified'¶

MAX = 'max'¶

MIN = 'min'¶

MEAN = 'mean'¶

class avatar_yaml.models.parameters.ExcludeVariablesMethod(*values)[source]¶

Bases: str, Enum

The method used to exclude a column.

ROW_ORDER = 'row_order'¶: SENSITIVE. The excluded column will be linked to the original row order, which is a violation of privacy.

COORDINATE_SIMILARITY = 'coordinate_similarity'¶: The excluded column will be linked by individual similarity.

class avatar_yaml.models.parameters.ImputeMethod(*values)[source]¶

Bases: str, Enum

KNN = 'knn'¶

MODE = 'mode'¶

MEDIAN = 'median'¶

MEAN = 'mean'¶

FAST_KNN = 'fast_knn'¶

class avatar_yaml.models.parameters.ProjectionType(*values)[source]¶

Bases: str, Enum

FPCA = 'fpca'¶

FLATTEN = 'flatten'¶

class avatar_yaml.models.schema.ColumnType(*values)[source]¶

Bases: StrEnum

INT = 'int'¶

BOOL = 'bool'¶

CATEGORY = 'category'¶

NUMERIC = 'float'¶

DATETIME = 'datetime'¶

DATETIME_TZ = 'datetime_tz'¶
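Note that the member name NUMERIC does not match its value 'float', so lookups by name and by value use different strings. The sketch below illustrates both lookups on a local stand-in that mirrors the documented members. It subclasses `str` and `Enum` instead of `StrEnum` so that it also runs on Python versions older than 3.11.

```python
from enum import Enum


class ColumnType(str, Enum):
    """Local stand-in mirroring the documented members."""

    INT = "int"
    BOOL = "bool"
    CATEGORY = "category"
    NUMERIC = "float"
    DATETIME = "datetime"
    DATETIME_TZ = "datetime_tz"


by_value = ColumnType("float")    # lookup by value, e.g. from a YAML file
by_name = ColumnType["NUMERIC"]   # lookup by member name
```

Both lookups return the same member, and because the enum derives from `str`, the member also compares equal to the plain string 'float'.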

class avatar_yaml.models.schema.LinkMethod(*values)[source]¶

Bases: StrEnum

Available assignment methods to link a child to its parent table after the anonymization.

LINEAR_SUM_ASSIGNMENT = 'linear_sum_assignment'¶: Assign using the linear sum assignment algorithm. This method is a good privacy and utility trade-off, but the algorithm is resource-intensive.

MINIMUM_DISTANCE_ASSIGNMENT = 'minimum_distance_assignment'¶: Assign using the minimum distance assignment algorithm. This method assigns the closest child to the parent. It is an acceptable privacy and utility trade-off. This algorithm consumes fewer resources than the linear sum assignment.

SENSITIVE_ORIGINAL_ORDER_ASSIGNMENT = 'sensitive_original_order_assignment'¶: Assign the child to the parent using the original order. WARNING: This method is a HIGH PRIVACY BREACH because it keeps the original order to assign the child to the parent. It is not recommended for privacy reasons, but it consumes fewer resources than the other methods.

TIME_SERIES = 'time_series'¶: Specific assignment method for time series data. It is used to link time series data to the parent table.
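The difference between the linear sum and minimum distance methods can be seen on a toy cost matrix. The sketch below illustrates the two generic algorithms only. It is not the library's implementation, the matrix is made up, the orientation (rows as children, columns as candidate parents) is an assumption, and the brute-force search stands in for a real solver.

```python
from itertools import permutations

# Made-up distances: cost[i][j] is the distance from child i to parent j.
cost = [
    [1, 2, 9],
    [1, 5, 9],
    [8, 3, 2],
]


def minimum_distance(cost):
    """Each child independently picks its closest parent."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost]


def linear_sum(cost):
    """Brute-force one-to-one assignment with the lowest total cost."""
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(best)


greedy = minimum_distance(cost)
optimal = linear_sum(cost)
```

In this example the minimum distance rule sends two children to the same parent, while the linear sum rule keeps the assignment one-to-one at a slightly higher total cost. The brute-force search grows factorially with the number of rows, so a dedicated solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice for realistic sizes.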

class avatars.constants.PlotKind(*values)[source]¶

Bases: StrEnum

Available plot types for visualization.

CORRELATION = 'correlation'¶: A correlation heatmap of the original and avatar data.

CORRELATION_DIFFERENCE = 'correlation_difference'¶: A heatmap of the difference between the correlations of the original and avatar data.

CONTRIBUTION = 'contribution'¶: A bar chart showing the contribution of each feature in the model.

PROJECTION_2D = '2d_projection'¶: A 2D projection of the original and avatar data.

PROJECTION_3D = '3d_projection'¶: A 3D projection of the original and avatar data.

DISTRIBUTION = 'distribution'¶: Distribution plots of the original and avatar data, with one plot per column.

AGGREGATE_STATS = 'aggregate_stats'¶: A table containing the mean and standard deviation of the original and avatar data (for the first 10 columns).

RAW_SERIES = 'raw_series'¶: A line plot of the original and avatar time series over time.

NORMALIZED_SERIES = 'normalized_series'¶: A line plot of the normalized original and avatar time series over time.

CLASS_PROJECTION_2D = 'class_projection_2d'¶: A 2D projection colored by the target class (only available with class balancing augmentation).

METRICS_SUMMARY = 'metrics_summary'¶: A summary table of privacy metrics.

class avatars.constants.Results(*values)[source]¶

Bases: StrEnum

ADVICE = 'advice'¶

SHUFFLED = 'shuffled'¶

UNSHUFFLED = 'unshuffled'¶

PRIVACY_METRICS = 'privacy_metrics'¶

SIGNAL_METRICS = 'signal_metrics'¶

REPORT_IMAGES = 'report_images'¶

PROJECTIONS_ORIGINAL = 'original_projections'¶

PROJECTIONS_AVATARS = 'avatar_projections'¶

METADATA = 'run_metadata'¶

REPORT = 'report'¶

META_PRIVACY_METRIC = 'meta_privacy_metric'¶

META_SIGNAL_METRIC = 'meta_signal_metric'¶

FIGURES = 'figures'¶

FIGURES_METADATA = 'figures_metadata'¶

PRIVACY_METRICS_SUMMARY = 'privacy_metrics_summary'¶

SIGNAL_METRICS_SUMMARY = 'signal_metrics_summary'¶

class avatar_yaml.models.parameters.AugmentationStrategy(*values)[source]¶

Bases: StrEnum

minority = 'minority'¶

not_majority = 'not_majority'¶

class avatar_yaml.models.avatar_metadata.SensitivityLevel(*values)[source]¶

Bases: str, Enum

Evaluation of the sensitivity level of the personal data being processed.

This assessment is based on factors such as the nature of the data and potential risks to data subjects. It applies to three categories of data:

Sensitive personal data (GDPR Art. 9): Special categories including health, racial/ethnic origin, political opinions, religious beliefs, trade union membership, genetic data, biometric data, sex life, or sexual orientation. These typically require VERY_HIGH or HIGH sensitivity levels.
Personal data (GDPR Art. 4): Any information relating to an identified or identifiable natural person (e.g., name, identification number, location data, online identifiers). Sensitivity level varies based on context and combination with other data.
Demographic data: Non-sensitive characteristics such as age, gender, geographic location, education level. These are typically LOW to MEDIUM sensitivity, but can increase when combined with other identifying information.

The sensitivity level should reflect potential harm to data subjects if the data were compromised or re-identified.

VERY_HIGH = 'Very High'¶

HIGH = 'High'¶

MEDIUM = 'Medium'¶

LOW = 'Low'¶

VERY_LOW = 'Very Low'¶

NEGLIGIBLE = 'Negligible'¶

UNDEFINED = 'Undefined'¶

class avatar_yaml.models.avatar_metadata.DataType(*values)[source]¶

Bases: str, Enum

Categories of personal data being processed, based on the context and sector of the data processing activity.

UNKNOWN = 'unknown'¶: The processing involves personal data of an unspecified type. The exact nature of the data has not been determined or categorized at this stage.

HEALTH = 'health'¶: The data processed originate from health-related datasets containing information on patients or study participants. These datasets typically include demographic, clinical, and behavioural variables, such as age, gender, diagnosis codes, treatment details, medical outcomes, and follow-up data.

HR = 'hr'¶: The personal data processed concern employees, job applicants, contractors, or trainees. The datasets generally include professional information such as identification data, employment history, remuneration details, performance evaluations, and training records. Certain datasets may also include information relating to health or diversity monitoring.

MOBILITY = 'mobility'¶: The personal data processed typically relate to users of transport systems, vehicle operators, or mobility service subscribers. These datasets may include identifiers, geolocation traces, timestamps, usage frequency, travel routes, and behavioural metrics. Depending on the context, they may also contain information derived from connected vehicles or smart ticketing systems.

INSURANCE = 'insurance'¶: The personal data processed typically relate to policyholders, beneficiaries, or claimants. The datasets may include demographic characteristics, contract details, claim histories, financial indicators, and, in some cases, health-related information.

FINANCE = 'finance'¶: The personal data processed concern clients, investors, account holders, or financial service users. Typical datasets may include identification data, transaction histories, account balances, income levels, credit ratings, and investment portfolios. In certain contexts, they may also contain data classified as sensitive, such as information revealing financial hardship or vulnerability.

EDUCATION = 'education'¶: The personal data processed relate to students, teachers, or administrative staff within educational institutions. The datasets may include demographic information, academic performance records, attendance logs, course enrolments, and, where relevant, special educational needs or socio-economic indicators.

class avatar_yaml.models.avatar_metadata.DataSubject(*values)[source]¶

Bases: str, Enum

Categories of individuals whose personal data are being processed, based on the context and purpose of the data processing activity.

UNKNOWN = 'unknown'¶

PATIENTS = 'patients'¶: The data subjects are patients whose personal data are processed in the context of medical research, healthcare provision, or clinical trials. Such data may include information directly or indirectly identifying individuals, together with health-related or demographic variables.

EMPLOYEES = 'employees'¶: The data subjects are employees, job applicants, or contractors whose personal data are processed for human resources management, organisational analysis, or workforce studies. Such data may encompass professional identifiers, career trajectories, remuneration details, performance indicators, and training records.

CLIENTS = 'clients'¶: The data subjects are clients, customers, or insured persons whose personal data are processed for the purposes of service provision, product analysis, or contractual performance. These data may include identifying information, transaction or claim histories, contact details, and, in some contexts, financial or health-related information.

USERS = 'users'¶: The data subjects are users of digital, public, or mobility services whose personal data are processed for analytical, operational, or optimisation purposes. The data may include identifiers, behavioural indicators, service usage patterns, or geolocation data.

STUDENTS = 'students'¶: The data subjects are students enrolled in educational institutions whose personal data are processed for pedagogical, administrative, or research purposes. The datasets may include demographic information, academic performance, attendance records, or socio-economic indicators.

class avatar_yaml.models.avatar_metadata.DataRecipient(*values)[source]¶

Bases: str, Enum

Categories of recipients for the anonymised data, based on their relationship to the Data Controller and the context of data sharing.

UNKNOWN = 'unknown'¶: The recipients of the anonymised data have not been specifically identified at this stage. The data recipient category will need to be determined to properly assess the privacy risks associated with data sharing and ensure appropriate safeguards are in place.

OPENDATA = 'opendata'¶: The recipients of the anonymised data are the general public, through publication in an open data repository or public research platform. Such dissemination aims to promote scientific collaboration, innovation, or public transparency. To guarantee full compliance with data protection requirements, the datasets released as open data have undergone an anonymisation process.

CONTRACTUAL_THIRDPARTY = 'contractual_thirdparty'¶: The recipients of the anonymised data are third parties with whom the Data Controller maintains a contractual relationship, such as research partners, insurers, data analytics firms, or other commercial entities. These transfers occur within a controlled legal framework ensuring compliance with the principles of confidentiality, data minimisation, and purpose limitation.

INTERNAL = 'internal'¶: The recipients of the anonymised data are exclusively internal stakeholders of the Data Controller, such as authorised employees, researchers, or analysts operating within the same organisation. The synthetic datasets are used for internal analytical, research, or operational purposes, in strict compliance with the principles of data protection by design and by default.

OUTSIDE_EU = 'outside_eu'¶: The recipients of the anonymised data are entities established outside the European Union, including international research institutions or commercial partners.

TRUSTED_THIRDPARTY = 'trusted_thirdparty'¶: The recipients of the anonymised data are trusted third parties operating under a contractual or institutional framework that ensures compliance with data protection and ethical standards. These may include subcontractors providing technical services, scientific publishers, or data repositories managing peer-reviewed research outputs. The sharing of anonymised datasets with such entities is governed by confidentiality agreements and data processing clauses that explicitly prohibit any attempt at re-identification.