Sunday, March 29, 2026

Generative Adversarial Networks Revisited: The Art of Creative Machine Learning




Welcome to an exploration of one of the most fascinating advancements in artificial intelligence: Generative Adversarial Networks, affectionately known as GANs. Imagine a world where machines can not only understand but also create new, original content that is indistinguishable from human-made creations. This is the promise and power of GANs, a revolutionary framework that has opened up new frontiers in machine learning.


The Core Idea: A Creative Forger and an Expert Detective


At its heart, a Generative Adversarial Network operates on a simple yet profound principle: competition. It involves two distinct neural networks, a "generator" and a "discriminator," locked in a continuous, adversarial game. Think of it like a skilled art forger trying to create a masterpiece that can fool an expert art detective. The forger constantly refines their technique, learning from their mistakes, while the detective sharpens their eye, becoming better at spotting fakes.


Let us illustrate this with a running example: imagine we want to train a machine to write realistic customer reviews for a fictional new product, say, a "Quantum Coffee Maker." Our goal is to generate reviews that sound so authentic, they could have been written by actual customers.


The Generator Network is our creative forger. Its job is to invent new customer reviews that appear genuine. Initially, it might produce nonsensical strings of words, like "coffee quantum good make machine." But over time, it learns from the feedback it receives, gradually improving its ability to craft believable sentences and sentiments.


The Discriminator Network is our expert detective. Its task is to examine any given review and determine whether it is a real review from an actual customer or a fake one conjured up by the generator. It is trained on a dataset of both real customer reviews and the fake reviews produced by the generator. Its ultimate aim is to become so adept that it can perfectly distinguish between the two.


How GANs Work: The Adversarial Dance


The training process of a GAN is an iterative dance between these two networks. They are trained simultaneously but with opposing goals.


In each training step, the generator first creates a batch of fake reviews from a random noise input. This noise acts as a creative spark, providing the generator with a unique starting point for each new review. Then, the discriminator is presented with a mix of these newly generated fake reviews and a batch of real customer reviews. The discriminator processes both sets and attempts to classify each review as either "real" or "fake."


The discriminator's performance is then evaluated. If it correctly identifies a real review as real and a fake review as fake, its internal parameters are adjusted to reinforce these correct classifications. If it makes a mistake, its parameters are adjusted to learn from that error. This process makes the discriminator a more discerning critic.
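The "adjustment" described above is driven by a loss function; for a real-versus-fake verdict, that loss is binary cross-entropy. A minimal NumPy sketch (the prediction values are invented for illustration):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy: the loss behind the discriminator's adjustments."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # guard against log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Correct, confident verdicts (real scored 0.95, fake scored 0.05) cost little
low_loss = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.95, 0.05]))

# Confidently wrong verdicts are punished heavily, driving a large update
high_loss = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.05, 0.95]))

print(f"correct: {low_loss:.4f}, wrong: {high_loss:.4f}")
```

Gradient descent then nudges the discriminator's weights in whichever direction shrinks this loss.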


Following the discriminator's training, the generator gets its turn. The generator's goal is to fool the discriminator. It produces another batch of fake reviews, and these are then fed to the discriminator. However, during this phase, the discriminator's parameters are frozen; it acts as a fixed judge. The generator's parameters are adjusted based on how successful it was at tricking the discriminator. If the discriminator was fooled into thinking a fake review was real, the generator is rewarded and its internal mechanisms are adjusted to produce more reviews like that. If the discriminator easily spotted the fakes, the generator learns to improve its forgery skills.


This continuous back-and-forth, where the generator tries to improve its fakes and the discriminator tries to improve its detection, drives both networks to higher levels of performance. Eventually, the generator becomes so good at creating realistic reviews that the discriminator can no longer reliably tell the difference between real and fake ones, effectively guessing with 50 percent accuracy. At this point, the generator has learned to produce highly convincing synthetic data.
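The 50 percent equilibrium can be made concrete with a little arithmetic. In the sketch below, using a hypothetical four-review batch, a discriminator reduced to guessing outputs 0.5 for every sample, which yields exactly 50 percent accuracy and a binary cross-entropy loss of ln(2):

```python
import numpy as np

# Hypothetical mini-batch at equilibrium: two real reviews (label 1) and two
# fakes (label 0). The discriminator can no longer tell them apart, so it
# outputs 0.5 for every sample.
y_true = np.array([1.0, 1.0, 0.0, 0.0])
y_pred = np.full(4, 0.5)

# Thresholding at 0.5 calls everything "real": right on the real half,
# wrong on the fake half, i.e. 50 percent accuracy
accuracy = np.mean((y_pred >= 0.5) == (y_true == 1.0))

# Binary cross-entropy settles at ln(2), its value for pure guessing
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(f"accuracy: {accuracy:.2f}, loss: {bce:.4f}")  # accuracy: 0.50, loss: 0.6931
```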


Anatomy of a GAN: A Closer Look at Its Components


Let us delve into the specific components that make up these two powerful networks, using our customer review example.


The Generator Network: The Creative Forger


The generator's primary function is to transform a random input, often called a "latent space vector" or "noise vector," into a data instance that resembles the training data. For our review generation task, this means turning a numerical vector into a sequence of words that form a coherent and plausible customer review.


Input: The generator typically starts with a vector of random numbers, perhaps 100 dimensions long, sampled from a simple distribution like a uniform or normal distribution. This random vector serves as the creative seed for each unique review the generator will produce. Different random vectors should ideally lead to different generated reviews.
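Sampling such a seed is a one-liner. A minimal sketch, assuming a 100-dimensional standard normal distribution and a batch of four reviews:

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded only so the sketch is reproducible

latent_dim = 100  # dimensionality of the creative seed

# One noise vector per review to generate; here, a batch of four seeds
noise_batch = rng.normal(loc=0.0, scale=1.0, size=(4, latent_dim))

print(noise_batch.shape)  # (4, 100)
```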


Output: The output of our generator will be a sequence of word identifiers, which can then be mapped back to actual words to form a review text. For instance, if our vocabulary assigns 'great' to 1, 'product' to 2, and 'love' to 3, the generator might output a sequence like [1, 2, 3], which translates to "great product love".


Architecture: For text generation, recurrent neural networks (RNNs) like Long Short-Term Memory (LSTM) units or Gated Recurrent Units (GRU) are commonly used because they are excellent at handling sequential data. The generator might start with a dense layer to expand the latent noise vector, followed by a series of LSTM or GRU layers to build up the sequence, and finally a dense layer with a softmax activation function over the entire vocabulary to output probabilities for each word at each position in the sequence.


Let us look at a simplified code snippet for building such a generator using Keras:



import tensorflow as tf

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense, LSTM, Reshape, Embedding, TimeDistributed


def build_generator(latent_dim, vocab_size, max_seq_length):

    """

    Constructs the generator model responsible for creating fake reviews.


    Parameters:

        latent_dim (int): The dimensionality of the random noise input vector.

        vocab_size (int): The total number of unique words in our vocabulary.

        max_seq_length (int): The maximum length of a generated review sequence.


    Returns:

        tf.keras.Model: The compiled generator model.

    """

    model = Sequential()


    # The initial dense layer expands the latent noise vector to a size that

    # can be reshaped into a sequence of 'timesteps' for the LSTM. For example,

    # if max_seq_length is 10, the Dense layer outputs 10 * 128 = 1280 values,

    # which are reshaped into 10 timesteps of 128 features each.

    model.add(Dense(128 * max_seq_length, input_dim=latent_dim))

    model.add(Reshape((max_seq_length, 128))) # Reshape to (timesteps, features)


    # LSTM layer to process the sequence and learn sequential dependencies.

    # return_sequences=True ensures that the LSTM outputs a sequence for the next layer.

    model.add(LSTM(256, return_sequences=True))


    # TimeDistributed Dense layer applies a Dense layer to each timestep of the sequence.

    # This is crucial for generating a probability distribution over the entire vocabulary

    # for each word position in the review.

    model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))


    return model


# Example usage (not part of the actual running example, just for illustration)

# generator_model = build_generator(latent_dim=100, vocab_size=1000, max_seq_length=10)

# generator_model.summary()


The `build_generator` function defines our generator. It takes a random noise vector, expands it, and then uses an LSTM layer to create a sequence of outputs. The `TimeDistributed(Dense(vocab_size, activation='softmax'))` layer is particularly important here; it ensures that for every position in the generated review sequence, the model outputs a probability distribution over all possible words in our vocabulary. The word with the highest probability is then chosen for that position.
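To make that last step concrete, here is a toy sketch of decoding one generated review; the probability grid and the four-word vocabulary are invented for illustration:

```python
import numpy as np

# Hypothetical generator output for one three-word review over a toy
# four-word vocabulary: each row is a softmax distribution for one position
probs = np.array([
    [0.10, 0.70, 0.10, 0.10],  # position 0: word ID 1 is most likely
    [0.20, 0.10, 0.60, 0.10],  # position 1: word ID 2
    [0.10, 0.10, 0.10, 0.70],  # position 2: word ID 3
])

vocab = {0: '<pad>', 1: 'great', 2: 'product', 3: 'love'}

# argmax along the vocabulary axis picks the most probable word per position
word_ids = np.argmax(probs, axis=-1)
review = ' '.join(vocab[i] for i in word_ids)

print(review)  # great product love
```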


The Discriminator Network: The Expert Detective


The discriminator's role is to evaluate incoming data and classify it as either real or fake. In our review example, it must distinguish between actual customer reviews and those fabricated by the generator.


Input: The discriminator receives sequences of word identifiers, representing either real or fake customer reviews.


Output: A single probability value between 0 and 1. A value close to 1 indicates the discriminator believes the input review is real, while a value close to 0 suggests it believes the review is fake.
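Turning that probability into a verdict is just a threshold test. A tiny sketch, with `classify_review` as a hypothetical helper:

```python
def classify_review(prob_real, threshold=0.5):
    """Map the discriminator's output probability to a verdict."""
    return 'real' if prob_real >= threshold else 'fake'

# A score near 1 means the detective is convinced the review is genuine
print(classify_review(0.92))  # real

# A score near 0 means it smells a forgery
print(classify_review(0.08))  # fake
```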


Architecture: For text classification, the discriminator also often uses recurrent layers like LSTMs or GRUs, possibly combined with embedding layers to convert word identifiers into dense vector representations. The network typically ends with a dense layer and a sigmoid activation function to output the single probability.


Here is a simplified code snippet for building our discriminator:


import tensorflow as tf

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense, LSTM, Embedding, Flatten, Dropout


def build_discriminator(vocab_size, max_seq_length):

    """

    Constructs the discriminator model responsible for classifying reviews as real or fake.


    Parameters:

        vocab_size (int): The total number of unique words in our vocabulary.

        max_seq_length (int): The maximum length of a review sequence.


    Returns:

        tf.keras.Model: The compiled discriminator model.

    """

    model = Sequential()


    # Embedding layer converts word indices into dense vectors.

    # input_length specifies the expected length of input sequences.

    model.add(Embedding(vocab_size, 128, input_length=max_seq_length))


    # LSTM layer processes the sequence of word embeddings.

    # We don't need return_sequences=True here as we only care about the final

    # classification for the entire sequence.

    model.add(LSTM(256))


    # Dropout layer helps prevent overfitting by randomly setting a fraction of input

    # units to 0 at each update during training time, which helps prevent co-adaptation

    # of neurons.

    model.add(Dropout(0.3))


    # Final dense layer with sigmoid activation outputs a single probability

    # indicating whether the review is real (close to 1) or fake (close to 0).

    model.add(Dense(1, activation='sigmoid'))


    # Compile the discriminator with an optimizer and a loss function.

    # Binary cross-entropy is standard for binary classification.

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    return model


# Example usage (not part of the actual running example, just for illustration)

# discriminator_model = build_discriminator(vocab_size=1000, max_seq_length=10)

# discriminator_model.summary()



The `build_discriminator` function creates a model that takes a sequence of word IDs, converts them into meaningful embeddings, processes them with an LSTM, and then outputs a single probability. This probability tells us how confident the discriminator is that the input review is real.


The Training Loop: Orchestrating the Competition


The training loop is where the generator and discriminator engage in their adversarial game. It involves alternating updates to each network's weights.


import numpy as np

import tensorflow as tf

from tensorflow.keras.models import Model

from tensorflow.keras.optimizers import Adam

from tensorflow.keras.preprocessing.text import Tokenizer

from tensorflow.keras.preprocessing.sequence import pad_sequences


# Assume build_generator and build_discriminator functions are defined as above


def build_gan(generator, discriminator):

    """

    Combines the generator and discriminator into a single GAN model for training the generator.


    During generator training, the discriminator's weights are frozen.

    The GAN model takes random noise as input and outputs the discriminator's

    classification of the generated output.


    Parameters:

        generator (tf.keras.Model): The generator model.

        discriminator (tf.keras.Model): The discriminator model.


    Returns:

        tf.keras.Model: The compiled GAN model.

    """

    # Make the discriminator non-trainable when training the generator

    discriminator.trainable = False


    # Connect the generator output to the discriminator input.

    # Caveat: the generator emits softmax probabilities over the vocabulary,

    # while the discriminator's Embedding layer expects integer word indices,

    # so this direct wiring is a conceptual simplification. Practical text GANs

    # bridge the discrete gap with a differentiable relaxation such as

    # Gumbel-softmax, or by letting the discriminator consume the probabilities.

    gan_output = discriminator(generator.output)


    # Define the GAN model: input is generator's input, output is discriminator's output

    gan_model = Model(inputs=generator.input, outputs=gan_output)


    # Compile the GAN model. The loss here is for the generator, trying to make

    # the discriminator output 'real' (label 1) for its fake samples.

    gan_model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.0002, beta_1=0.5))

    return gan_model


def train_gan(generator, discriminator, gan_model, real_reviews_sequences,

              latent_dim, n_epochs=100, batch_size=64, vocab_size=None,

              max_seq_length=None, tokenizer=None):

    """

    Trains the Generative Adversarial Network.


    Parameters:

        generator (tf.keras.Model): The generator model.

        discriminator (tf.keras.Model): The discriminator model.

        gan_model (tf.keras.Model): The combined GAN model.

        real_reviews_sequences (np.array): Preprocessed real review sequences.

        latent_dim (int): Dimensionality of the generator's noise input.

        n_epochs (int): Number of training epochs.

        batch_size (int): Size of batches for training.

        vocab_size (int): Total number of unique words in the vocabulary.

        max_seq_length (int): Maximum length of a review sequence.

        tokenizer (tf.keras.preprocessing.text.Tokenizer): The tokenizer used for text.

    """

    half_batch = batch_size // 2


    # Labels for real and fake samples.

    # One-sided label smoothing (0.9 instead of 1.0 for real samples) keeps the

    # discriminator from becoming overconfident. Note that soft labels skew the

    # reported accuracy, since Keras compares rounded predictions against them.

    real_labels = np.ones((half_batch, 1)) * 0.9 # One-sided label smoothing

    fake_labels = np.zeros((half_batch, 1))


    for epoch in range(n_epochs):

        # ---------------------

        #  Train Discriminator

        # ---------------------


        # Select a random half_batch of real reviews

        idx = np.random.randint(0, real_reviews_sequences.shape[0], half_batch)

        real_reviews = real_reviews_sequences[idx]


        # Generate a half_batch of fake reviews

        noise = np.random.normal(0, 1, (half_batch, latent_dim))

        generated_reviews_indices = generator.predict(noise)


        # The generator outputs, at every position, a softmax probability

        # distribution over the vocabulary. Taking the argmax along the last

        # axis converts those probabilities into discrete word IDs. (Sampling

        # from the distribution instead would give more varied reviews.)

        generated_reviews_indices = np.argmax(generated_reviews_indices, axis=-1)

        # Defensive guard; argmax already yields indices in [0, vocab_size - 1]

        generated_reviews_indices = np.clip(generated_reviews_indices, 0, vocab_size - 1)



        # Train the discriminator

        d_loss_real = discriminator.train_on_batch(real_reviews, real_labels)

        d_loss_fake = discriminator.train_on_batch(generated_reviews_indices, fake_labels)

        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)


        # ---------------------

        #  Train Generator

        # ---------------------


        # Generate a batch of noise vectors for the generator

        noise = np.random.normal(0, 1, (batch_size, latent_dim))


        # The generator wants the discriminator to classify its fakes as real (label 1)

        valid_y = np.ones((batch_size, 1))


        # Train the generator (via the combined GAN model)

        g_loss = gan_model.train_on_batch(noise, valid_y)


        # Print progress

        print(f"Epoch {epoch}/{n_epochs} [D loss: {d_loss[0]:.4f}, acc.: {100*d_loss[1]:.2f}%] [G loss: {g_loss:.4f}]")


        # Optionally, generate and print a sample review every 100 epochs

        if epoch % 100 == 0:

            sample_noise = np.random.normal(0, 1, (1, latent_dim))

            generated_sample_indices = generator.predict(sample_noise)

            generated_sample_indices = np.argmax(generated_sample_indices, axis=-1)

            generated_sample_indices = np.clip(generated_sample_indices, 0, vocab_size - 1)


            # Convert indices back to words for human-readable output

            if tokenizer:

                # Create a reverse word map for decoding

                reverse_word_map = dict(map(reversed, tokenizer.word_index.items()))

                def sequence_to_text(sequence):

                    words = [reverse_word_map.get(idx, '<UNK>') for idx in sequence if idx != 0] # Filter out padding

                    return ' '.join(words)

                sample_review_text = sequence_to_text(generated_sample_indices[0])

                print(f"  Generated sample review: '{sample_review_text}'")

            else:

                print(f"  Generated sample review indices: {generated_sample_indices[0]}")


# Note: The actual execution of train_gan with real data and a tokenizer

# will be in the addendum. This snippet focuses on the loop logic.



The `train_gan` function orchestrates the entire training process. It iteratively:

1.  Trains the discriminator: It takes half a batch of real reviews and half a batch of fake reviews generated by the current generator. It then updates the discriminator's weights to better distinguish between them.

2.  Trains the generator: It generates a full batch of fake reviews and tries to fool the discriminator into classifying them as real. During this step, only the generator's weights are updated, while the discriminator acts as a fixed judge.


Why GANs are Useful: Beyond Fake Reviews


While our example focuses on generating customer reviews, the applications of Generative Adversarial Networks extend far beyond this niche. Their ability to generate realistic data makes them incredibly versatile.


One prominent use is in image generation. GANs can create hyper-realistic images of faces, landscapes, animals, or objects that have never existed. This capability is used in fields like entertainment for creating virtual characters or in design for generating new product concepts.


Another significant application is image-to-image translation. This involves transforming an image from one domain to another. Examples include converting sketches into photorealistic images, changing day scenes to night scenes, or even altering facial expressions while preserving identity.


GANs are also invaluable for data augmentation. In scenarios where real training data is scarce, GANs can generate synthetic data that closely mimics the real data, thereby expanding the training set and improving the performance of other machine learning models. For instance, in medical imaging, GANs can create more examples of rare disease conditions.


Furthermore, they contribute to super-resolution, enhancing the quality of low-resolution images by generating missing details, and in drug discovery, where they can propose new molecular structures with desired properties. The creative capacity of GANs is constantly being explored, leading to novel solutions across various industries.


Challenges and Considerations


Despite their power, training GANs can be challenging. One common issue is mode collapse, where the generator learns to produce only a limited variety of outputs, even if the real data is diverse. For instance, our review generator might only learn to write positive reviews, ignoring negative or neutral sentiments. This happens when the generator finds a few samples that consistently fool the discriminator and stops exploring the full data distribution.
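A quick, informal probe for mode collapse is to decode a batch of generated reviews and measure how many are distinct. The sketch below uses a hypothetical `diversity_ratio` helper on made-up word-ID batches:

```python
import numpy as np

def diversity_ratio(generated_ids):
    """Fraction of unique sequences in a batch of generated word-ID arrays.

    A ratio near 1.0 suggests varied outputs; a ratio near 0 is a red flag
    for mode collapse (many noise vectors mapping to the same review).
    """
    unique_rows = {tuple(row) for row in generated_ids}
    return len(unique_rows) / len(generated_ids)

# A collapsed generator: every noise vector yields the same review
collapsed = np.array([[1, 2, 3]] * 8)

# A healthier generator: varied outputs (one accidental repeat)
varied = np.array([[1, 2, 3], [4, 5, 6], [1, 5, 3], [2, 2, 1],
                   [3, 1, 4], [6, 5, 2], [2, 4, 6], [1, 2, 3]])

print(diversity_ratio(collapsed))  # 0.125
print(diversity_ratio(varied))     # 0.875
```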


Another challenge is training stability. GANs are notoriously difficult to train because of the delicate balance required between the generator and discriminator. If one network becomes too powerful too quickly, training can become unstable and produce poor results, so careful tuning of hyperparameters and architectural choices is often required. Text adds a further wrinkle: words are discrete, so gradients cannot flow from the discriminator back through a hard word-selection step, which is why practical text GANs lean on workarounds such as the Gumbel-softmax relaxation or reinforcement-learning-based training, as in SeqGAN.


Finally, evaluation metrics for GANs are still an active area of research. It is hard to quantitatively assess the "realism" or "diversity" of generated content, especially for complex data like text or images. Human evaluation often remains a crucial, albeit subjective, method.


Conclusion


Generative Adversarial Networks represent a monumental leap in machine learning, enabling machines to move beyond mere analysis and into the realm of creation. By pitting two neural networks against each other in an adversarial game, GANs learn to generate highly realistic and diverse synthetic data. From crafting compelling fake customer reviews to conjuring photorealistic images and aiding scientific discovery, their potential continues to unfold. While challenges like mode collapse and training stability persist, the ongoing research and innovation in GANs promise even more astonishing applications in the years to come, further blurring the lines between artificial and authentic creation.


Addendum: Full Running Example Code for Generating Fake Customer Reviews


This complete script demonstrates how to build and train a simple GAN to generate short, fake customer reviews for our "Quantum Coffee Maker" example.



import numpy as np

import tensorflow as tf

from tensorflow.keras.models import Sequential, Model

from tensorflow.keras.layers import Dense, LSTM, Reshape, Embedding, TimeDistributed, Dropout

from tensorflow.keras.optimizers import Adam

from tensorflow.keras.preprocessing.text import Tokenizer

from tensorflow.keras.preprocessing.sequence import pad_sequences

import os


# --- Configuration Parameters ---

VOCAB_SIZE = 50        # Maximum number of unique words in our vocabulary

MAX_SEQ_LENGTH = 10    # Maximum length of a review sequence (e.g., "great product love it" is 4 words)

LATENT_DIM = 100       # Dimension of the random noise input to the generator

N_EPOCHS = 2000        # Number of training iterations

BATCH_SIZE = 64        # Number of samples per training batch

BUFFER_SIZE = 10000    # For shuffling dataset


# --- 1. Define Generator Model ---

def build_generator(latent_dim, vocab_size, max_seq_length):

    """

    Constructs the generator model responsible for creating fake reviews.

    It takes a random noise vector and transforms it into a sequence of word indices.

    """

    model = Sequential(name="generator")


    # Initial dense layer to expand the latent noise vector.

    # We expand it to a size that can be reshaped into a sequence of 'timesteps'

    # for the LSTM layer. The 128 here is an arbitrary feature dimension per timestep.

    model.add(Dense(128 * max_seq_length, input_dim=latent_dim))

    model.add(Reshape((max_seq_length, 128))) # Reshape to (timesteps, features)


    # LSTM layer to process the sequence and learn sequential dependencies.

    # return_sequences=True ensures that the LSTM outputs a sequence for the next layer,

    # which is necessary when generating a sequence of words.

    model.add(LSTM(256, return_sequences=True))


    # TimeDistributed Dense layer applies a Dense layer to each timestep of the sequence.

    # This is crucial for generating a probability distribution over the entire vocabulary

    # for each word position in the review. The softmax activation converts these

    # into probabilities.

    model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))


    return model


# --- 2. Define Discriminator Model ---

def build_discriminator(vocab_size, max_seq_length):

    """

    Constructs the discriminator model responsible for classifying reviews as real or fake.

    It takes a sequence of word indices and outputs a single probability.

    """

    model = Sequential(name="discriminator")


    # Embedding layer converts integer word indices into dense vectors.

    # This helps the model understand semantic relationships between words.

    # input_length specifies the expected length of input sequences.

    model.add(Embedding(vocab_size, 128, input_length=max_seq_length))


    # LSTM layer processes the sequence of word embeddings to capture context.

    # We don't need return_sequences=True here as we only care about the final

    # classification for the entire sequence, not a sequence of outputs.

    model.add(LSTM(256))


    # Dropout layer helps prevent overfitting by randomly setting a fraction of input

    # units to 0 during training, which encourages the network to learn more robust features.

    model.add(Dropout(0.3))


    # Final dense layer with sigmoid activation outputs a single probability

    # indicating whether the review is real (close to 1) or fake (close to 0).

    model.add(Dense(1, activation='sigmoid'))


    # Compile the discriminator with an Adam optimizer and binary cross-entropy loss,

    # which is standard for binary classification tasks.

    discriminator_optimizer = Adam(learning_rate=0.0002, beta_1=0.5)

    model.compile(loss='binary_crossentropy', optimizer=discriminator_optimizer, metrics=['accuracy'])

    return model


# --- 3. Define Combined GAN Model ---

def build_gan(generator, discriminator):

    """

    Combines the generator and discriminator into a single GAN model for training the generator.

    When training the GAN, the discriminator's weights are frozen.

    """

    # Make the discriminator non-trainable when training the generator.

    # This is crucial: we only want to update the generator's weights based on

    # how well it fools the *current* discriminator.

    discriminator.trainable = False


    # Connect the generator output to the discriminator input.

    # Caveat: the generator emits softmax probabilities over the vocabulary,

    # while the discriminator's Embedding layer expects integer word indices;

    # this direct wiring is a conceptual simplification of how practical text

    # GANs bridge the discrete gap (e.g., via a Gumbel-softmax relaxation).

    gan_output = discriminator(generator.output)


    # Define the GAN model: input is the generator's input (noise),

    # and the output is the discriminator's classification of the generated data.

    gan_model = Model(inputs=generator.input, outputs=gan_output, name="gan")


    # Compile the GAN model. The loss here is for the generator, which tries to make

    # the discriminator output 'real' (label 1) for its fake samples.

    gan_optimizer = Adam(learning_rate=0.0002, beta_1=0.5)

    gan_model.compile(loss='binary_crossentropy', optimizer=gan_optimizer)

    return gan_model


# --- 4. Prepare Real Data (Simulated) ---

def load_real_samples(vocab_size, max_seq_length):

    """

    Simulates loading and preprocessing real customer review data.

    In a real scenario, this would load actual text files.

    """

    # A small set of example real reviews for our "Quantum Coffee Maker"

    real_reviews_text = [

        "This quantum coffee maker is great, love the speed.",

        "Best coffee machine ever, highly recommend to everyone.",

        "Excellent product, makes amazing coffee every morning.",

        "I love my new quantum coffee maker, it's a game changer.",

        "Simply fantastic, the coffee tastes out of this world.",

        "Highly satisfied with this purchase, worth every penny.",

        "The best quantum coffee experience I've had so far.",

        "Five stars for this innovative coffee maker.",

        "My morning routine is so much better with this machine.",

        "A must-have for any coffee enthusiast, truly revolutionary."

    ]


    # Initialize a tokenizer to convert words to integers

    # oov_token handles out-of-vocabulary words

    tokenizer = Tokenizer(num_words=vocab_size, oov_token="<unk>")

    tokenizer.fit_on_texts(real_reviews_text)


    # Convert text reviews to sequences of integers

    sequences = tokenizer.texts_to_sequences(real_reviews_text)


    # Pad sequences to ensure all reviews have the same length

    # 'post' padding adds zeros at the end, 'pre' adds at the beginning

    padded_sequences = pad_sequences(sequences, maxlen=max_seq_length, padding='post')


    print(f"Loaded {len(real_reviews_text)} real review samples.")

    print(f"Vocabulary size: {len(tokenizer.word_index) + 1}")

    print(f"Example real sequence (padded): {padded_sequences[0]}")


    return padded_sequences, tokenizer


# --- 5. Helper Functions for Training ---

def generate_latent_points(latent_dim, n_samples):

    """

    Generates random noise vectors as input for the generator.

    """

    # Generate points in the latent space (e.g., from a normal distribution)

    x_input = np.random.normal(0, 1, (n_samples, latent_dim))

    return x_input


def generate_fake_samples(generator, latent_dim, n_samples, vocab_size):

    """

    Uses the generator to create fake review sequences.

    """

    # Generate random points in latent space

    x_input = generate_latent_points(latent_dim, n_samples)

    # Predict word probabilities using the generator

    X_raw_output = generator.predict(x_input, verbose=0)

    # Convert probabilities to discrete word indices by taking the argmax

    # We clip to ensure indices are within the valid vocab_size range.

    X = np.argmax(X_raw_output, axis=-1)

    X = np.clip(X, 0, vocab_size - 1) # Ensure indices are valid

    # Create 'fake' labels (0) for these generated samples. Hard zeros keep the

    # accuracy numbers later computed against these labels meaningful.

    y = np.zeros((n_samples, 1))

    return X, y


def summarize_performance(epoch, generator, discriminator, real_reviews_sequences,

                          latent_dim, n_samples=100, vocab_size=None,

                          max_seq_length=None, tokenizer=None):

    """

    Evaluates and prints the performance of the GAN at given intervals.

    Generates sample reviews and displays them.

    """

    # Evaluate the discriminator on real samples. Hard labels (1.0) are used

    # here: Keras binary accuracy compares rounded predictions against the

    # labels, so smoothed labels like 0.9 would always register as wrong.

    x_real, y_real = real_reviews_sequences, np.ones((len(real_reviews_sequences), 1))

    _, acc_real = discriminator.evaluate(x_real, y_real, verbose=0)


    # Generate fake samples and evaluate the discriminator on them

    x_fake, y_fake = generate_fake_samples(generator, latent_dim, n_samples, vocab_size)

    _, acc_fake = discriminator.evaluate(x_fake, y_fake, verbose=0)


    # Summarize discriminator performance

    print(f"Discriminator Accuracy: Real={acc_real*100:.2f}%, Fake={acc_fake*100:.2f}%")


    # Generate and print a few sample reviews

    print("--- Sample Generated Reviews ---")

    sample_noise = generate_latent_points(latent_dim, 3) # Generate 3 samples

    generated_samples_indices = generator.predict(sample_noise, verbose=0)

    generated_samples_indices = np.argmax(generated_samples_indices, axis=-1)

    generated_samples_indices = np.clip(generated_samples_indices, 0, vocab_size - 1)


    reverse_word_map = dict(map(reversed, tokenizer.word_index.items()))

    def sequence_to_text(sequence):

        # Filter out padding (0) and unknown tokens (<UNK>) for cleaner output

        words = [reverse_word_map.get(idx, '<UNK>') for idx in sequence if idx != 0 and reverse_word_map.get(idx, '<UNK>') != '<unk>']

        return ' '.join(words)


    for i, seq_indices in enumerate(generated_samples_indices):

        review_text = sequence_to_text(seq_indices)

        print(f"  Sample {i+1}: '{review_text}'")

    print("------------------------------")
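The `sequence_to_text` decoding above can be sketched standalone; the tiny `word_index` below is a made-up stand-in for the real tokenizer's vocabulary, with index 1 reserved for the out-of-vocabulary token as in a Keras-style tokenizer:

```python
# Hypothetical word index as produced by a Keras-style tokenizer with oov_token='<unk>'
word_index = {'<unk>': 1, 'coffee': 2, 'great': 3, 'machine': 4}
reverse_word_map = dict(map(reversed, word_index.items()))

def sequence_to_text(sequence):
    # Drop padding (index 0) and out-of-vocabulary tokens for readable output
    words = [reverse_word_map.get(idx, '<unk>') for idx in sequence
             if idx != 0 and reverse_word_map.get(idx, '<unk>') != '<unk>']
    return ' '.join(words)

print(sequence_to_text([3, 2, 4, 0, 0, 1]))  # → 'great coffee machine'
```

Note how both the trailing padding zeros and the `<unk>` index are silently dropped, so short or partially garbled generated sequences still print as readable fragments.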



# --- 6. Main Training Function ---

def train_gan_model(generator, discriminator, gan_model, real_reviews_sequences,

                   latent_dim, n_epochs, batch_size, vocab_size, max_seq_length, tokenizer):

    """

    Trains the Generative Adversarial Network.

    """

    half_batch = batch_size // 2


    # Create dataset from real reviews for efficient batching

    dataset = tf.data.Dataset.from_tensor_slices(real_reviews_sequences).shuffle(BUFFER_SIZE).batch(half_batch, drop_remainder=True) # Drop any partial final batch so it always matches the fixed-size label arrays


    # Labels for real and fake samples (with label smoothing)

    real_labels = np.ones((half_batch, 1)) * 0.9

    fake_labels = np.zeros((half_batch, 1)) + 0.1


    for epoch in range(n_epochs):

        for i, real_batch in enumerate(dataset):

            # ---------------------

            #  Train Discriminator

            # ---------------------


            # Generate a half_batch of fake reviews (generate_fake_samples draws its own noise)

            generated_reviews_indices, _ = generate_fake_samples(generator, latent_dim, half_batch, vocab_size)


            # Train the discriminator on real and fake samples

            # Note: real_batch is already a tf.Tensor, generated_reviews_indices is np.array

            d_loss_real = discriminator.train_on_batch(real_batch, real_labels)

            d_loss_fake = discriminator.train_on_batch(generated_reviews_indices, fake_labels)

            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)


            # ---------------------

            #  Train Generator

            # ---------------------


            # Generate a full batch of noise vectors for the generator

            noise = generate_latent_points(latent_dim, batch_size)


            # The generator wants the discriminator to classify its fakes as real (label 1)

            valid_y = np.ones((batch_size, 1))


            # Train the generator (via the combined GAN model)

            # The discriminator's weights are frozen during this step due to build_gan setup

            g_loss = gan_model.train_on_batch(noise, valid_y)


            # Print progress for the current batch

            if i % 10 == 0: # Print every 10 batches

                print(f"Epoch {epoch+1}/{n_epochs}, Batch {i+1} "

                      f"[D loss: {d_loss[0]:.4f}, acc.: {100*d_loss[1]:.2f}%] "

                      f"[G loss: {g_loss:.4f}]")


        # Summarize performance and generate samples at regular intervals

        if (epoch + 1) % 100 == 0 or epoch == 0: # Summarize every 100 epochs and after the first epoch

            print(f"\n--- Epoch {epoch+1} Summary ---")

            summarize_performance(epoch, generator, discriminator, real_reviews_sequences,

                                  latent_dim, n_samples=10, vocab_size=vocab_size,

                                  max_seq_length=max_seq_length, tokenizer=tokenizer)

            print("-----------------------------\n")
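The smoothed targets used during discriminator training (0.9 for real, 0.1 for fake, rather than hard 1/0 labels) can be checked in isolation; `half_batch` here is an arbitrary small value:

```python
import numpy as np

half_batch = 4

# Label smoothing: real targets pulled down from 1.0 and fake targets pushed up from 0.0,
# which keeps the discriminator from becoming overconfident and starving the generator of gradient
real_labels = np.ones((half_batch, 1)) * 0.9
fake_labels = np.zeros((half_batch, 1)) + 0.1

assert real_labels.shape == (half_batch, 1) and fake_labels.shape == (half_batch, 1)
assert np.allclose(real_labels, 0.9) and np.allclose(fake_labels, 0.1)
```

Each array has shape `(half_batch, 1)`, matching the discriminator's single sigmoid output per review, which is why the dataset batching above must never yield a partial batch.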


# --- Main Execution Block ---

if __name__ == "__main__":

    # Load and preprocess real review data

    real_reviews_sequences, tokenizer = load_real_samples(VOCAB_SIZE, MAX_SEQ_LENGTH)

    actual_vocab_size = len(tokenizer.word_index) + 1 # +1 for padding token


    # Build the discriminator

    discriminator = build_discriminator(actual_vocab_size, MAX_SEQ_LENGTH)

    discriminator.summary()


    # Build the generator

    generator = build_generator(LATENT_DIM, actual_vocab_size, MAX_SEQ_LENGTH)

    generator.summary()


    # Build the combined GAN model

    gan_model = build_gan(generator, discriminator)

    gan_model.summary()


    # Train the GAN

    print("\nStarting GAN training...")

    train_gan_model(generator, discriminator, gan_model, real_reviews_sequences,

                   LATENT_DIM, N_EPOCHS, BATCH_SIZE, actual_vocab_size, MAX_SEQ_LENGTH, tokenizer)


    print("\nGAN training finished.")

    print("Final performance summary:")

    summarize_performance(N_EPOCHS, generator, discriminator, real_reviews_sequences,

                          LATENT_DIM, n_samples=10, vocab_size=actual_vocab_size,

                          max_seq_length=MAX_SEQ_LENGTH, tokenizer=tokenizer)


    # Optional: Save the generator model for future use

    # generator.save('review_generator.h5')

    # print("\nGenerator model saved as 'review_generator.h5'")

