Generative modeling is a distinct area of modeling concerned with generating new data of a given kind. Generative models are typically probabilistic and, when based on machine learning, can learn from data.

Generative models can produce both structured and unstructured data. Structured data generation is sometimes used to create synthetic data for datasets with class imbalance (where one class has more instances than another). Generating unstructured data is generally a very challenging task. Models exist, for example, for generating:

  • Acoustic data: e.g. speech synthesis or music synthesis;
  • Visual data: e.g. image generation with deep learning methods such as Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs);
  • Natural language: models trained on large datasets to generate natural language can be fine-tuned to perform other tasks, such as classification, on much smaller datasets.
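
The structured-data case mentioned above, generating synthetic samples for an under-represented class, can be illustrated with a very small probabilistic model. The sketch below is only an illustration under stated assumptions: it fits an independent Gaussian to each feature of the minority class and samples new rows from it. The function names (fit_gaussian, sample), the toy values, and the independence assumption are all hypothetical choices made for this example; practical methods are usually more sophisticated.

```python
import random
import statistics


def fit_gaussian(rows):
    """Estimate a (mean, standard deviation) pair for each feature column."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def sample(params, n, rng):
    """Draw n synthetic rows, sampling every feature independently."""
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n)]


# Toy imbalanced dataset (made-up values): 6 majority rows, 3 minority rows.
majority = [[1.0, 2.1], [0.9, 1.8], [1.2, 2.0], [1.1, 2.2], [0.8, 1.9], [1.0, 2.0]]
minority = [[3.0, 0.5], [3.2, 0.7], [2.9, 0.4]]

rng = random.Random(0)  # fixed seed so the run is repeatable

# "Learning from data": estimate the model parameters from the minority class.
params = fit_gaussian(minority)

# "Generating data": sample enough synthetic rows to match the majority class size.
synthetic = sample(params, len(majority) - len(minority), rng)
balanced_minority = minority + synthetic
```

After running this, balanced_minority contains as many rows as majority. Treating the features as independent ignores any correlation between them, which is the main simplification of this sketch.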

An example: synthetic facial images generated by a StyleGAN generator [stylegan].

Literature

  1. [stylegan] Karras, T., Laine, S. and Aila, T., 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4401-4410).