conversion_project/PROJECT_STRUCTURE.md
2025-12-31 15:40:37 -05:00

AV1 Batch Video Transcoder - Project Structure

Overview

This project is a modular batch video transcoding system. It encodes video with FFmpeg's av1_nvenc encoder (AV1 on NVIDIA NVENC hardware), chooses an audio codec and bitrate per stream, and picks a target resolution per file.

Architecture

Entry Point

  • main.py - CLI entry point with argument parsing
    • Loads configuration from config.xml
    • Initializes logger and CSV tracker
    • Dispatches to process_folder() for batch processing
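
The flags used throughout this document (--r, --m, --cq) could be parsed roughly as follows. This is a minimal sketch, not the actual contents of main.py: the function name build_parser, the "cq" mode value, the defaults, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in this document (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="AV1 batch video transcoder")
    parser.add_argument("folder", help="Folder containing the videos to transcode")
    # Explicit target resolution; None means "use the smart resolution logic".
    parser.add_argument("--r", type=int, choices=(480, 720, 1080), default=None)
    # Rate-control mode; "cq" as a value and as the default is an assumption.
    parser.add_argument("--m", choices=("cq", "bitrate"), default="cq")
    # Optional CQ override; None means "use the value from config.xml".
    parser.add_argument("--cq", type=int, default=None)
    return parser


args = build_parser().parse_args(
    [r"C:\Videos\TV\ShowName", "--r", "720", "--m", "bitrate"]
)
print(args.folder, args.r, args.m, args.cq)
```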

Core Modules

core/config_helper.py

  • XML configuration parser
  • Returns dict with audio, encoding, and filter settings
  • Key Config:
    • audio.stereo.high/medium: Target bitrates for stereo audio (1080p/720p)
    • audio.multi_channel.low/medium: Target bitrates for multichannel audio
    • encode.cq: CQ values per content type (tv_720, tv_1080, movie_720, movie_1080, etc.)
    • encode.fallback: Bitrate fallback settings (900k/1080p, 650k/720p, etc.)
    • extensions: Video file types to process (mkv, mp4, etc.)
    • ignore_tags: Files to skip (trailer, sample, etc.)
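
To illustrate how such a dict might be produced, here is a small sketch using xml.etree.ElementTree. Only the `<low>384000</low>` element under multi_channel is confirmed by this document; the rest of the XML layout, the comma-separated list format, the other values, and the function name load_config are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical config.xml layout; only <low>384000</low> is taken from this document.
SAMPLE_XML = """
<config>
  <audio>
    <stereo><high>192000</high><medium>160000</medium></stereo>
    <multi_channel><low>384000</low><medium>448000</medium></multi_channel>
  </audio>
  <extensions>mkv, mp4</extensions>
  <ignore_tags>trailer, sample</ignore_tags>
</config>
"""


def load_config(xml_text: str) -> dict:
    """Parse the XML text into a nested dict (sketch, assumed schema)."""
    root = ET.fromstring(xml_text)

    def int_children(path: str) -> dict:
        # Map each child tag to its integer value, e.g. {"low": 384000}.
        node = root.find(path)
        return {child.tag: int(child.text) for child in node} if node is not None else {}

    def csv_list(path: str) -> list:
        # Split a comma-separated element into lowercase items.
        text = root.findtext(path, default="")
        return [item.strip().lower() for item in text.split(",") if item.strip()]

    return {
        "audio": {
            "stereo": int_children("audio/stereo"),
            "multi_channel": int_children("audio/multi_channel"),
        },
        "extensions": csv_list("extensions"),
        "ignore_tags": csv_list("ignore_tags"),
    }


config = load_config(SAMPLE_XML)
print(config["audio"]["multi_channel"]["low"])
```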

core/logger_helper.py

  • Comprehensive logging to logs/conversion.log
  • Captures source/target specs, audio decisions, bitrate info
  • Separate handlers for console (INFO+) and file (DEBUG+)
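
The two-handler setup described above can be sketched with the standard logging module. The logger name "conversion", the function name setup_logger, and the message format are assumptions; only the log path and the INFO/DEBUG split come from this document.

```python
import logging
from pathlib import Path


def setup_logger(log_path: str = "logs/conversion.log") -> logging.Logger:
    """Return a logger with a console handler (INFO+) and a file handler (DEBUG+)."""
    logger = logging.getLogger("conversion")  # logger name is an assumption
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering
    if logger.handlers:
        return logger  # already configured; avoid duplicate output

    Path(log_path).parent.mkdir(parents=True, exist_ok=True)

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)

    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (console, file_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

With this arrangement, a call such as logger.debug(...) reaches only the file, while logger.info(...) reaches both the file and the console.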

core/audio_handler.py

  • calculate_stream_bitrate(input_file, stream_index): Extracts the audio stream with ffmpeg -c copy, parses the bitrate that ffmpeg reports, and falls back to a calculation from the extracted file's size if parsing fails
  • get_audio_streams(input_file): Detects all audio streams with robust bitrate calculation
  • choose_audio_bitrate(channels, bitrate_kbps, audio_config, is_1080_class): Returns (codec, target_bitrate) tuple
    • Stereo 1080p+: >192k → encode; ≤192k → preserve ("copy")
    • Stereo 720p: >160k → encode; ≤160k → preserve
    • Multichannel: Encodes to the low (384k) or medium (448k) target, chosen from the source stream's bitrate
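
The parse-then-fall-back step in calculate_stream_bitrate can be illustrated as below. This is a sketch under assumptions: it expects a progress line of the form "bitrate= 192.0kbits/s" in ffmpeg's output, and the helper names and the sample line are hypothetical. The fallback is plain arithmetic: bits divided by seconds, divided by 1000.

```python
import re
from typing import Optional


def parse_ffmpeg_bitrate_kbps(ffmpeg_output: str) -> Optional[float]:
    """Return the last 'bitrate=...kbits/s' value in the text, or None if absent."""
    matches = re.findall(r"bitrate=\s*([\d.]+)kbits/s", ffmpeg_output)
    return float(matches[-1]) if matches else None


def bitrate_from_size_kbps(size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate in kbps computed from file size and duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def stream_bitrate_kbps(ffmpeg_output: str, size_bytes: int, duration_seconds: float) -> float:
    """Prefer the parsed value; fall back to the size-based calculation."""
    parsed = parse_ffmpeg_bitrate_kbps(ffmpeg_output)
    if parsed is not None:
        return parsed
    return bitrate_from_size_kbps(size_bytes, duration_seconds)


# A 14,400,000-byte stream lasting 600 s averages 192 kbps.
print(stream_bitrate_kbps("bitrate=N/A", 14_400_000, 600))
```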

core/video_handler.py

  • get_source_resolution(input_file): ffprobe detection of video dimensions
  • determine_target_resolution(src_width, src_height, explicit_resolution): Smart resolution logic
    • If the source is larger than 1080p (for example 4K) → scale down to 1080p
    • Else → preserve source resolution
    • Can be overridden with explicit --r 480/720/1080 argument
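
The resolution rules above fit in a few lines. In this sketch the return value is a target height in pixels, and "larger than 1080p" is taken to mean a height above 1080 or a width above 1920; both are assumptions, since the document does not state how the real function represents or tests resolutions.

```python
from typing import Optional


def determine_target_resolution(
    src_width: int, src_height: int, explicit_resolution: Optional[int] = None
) -> int:
    """Return the target height in pixels (sketch of the rules listed above)."""
    if explicit_resolution is not None:
        # An explicit --r value always wins.
        return explicit_resolution
    if src_height > 1080 or src_width > 1920:
        # Anything larger than 1080p is scaled down to 1080p.
        return 1080
    # Otherwise keep the source resolution.
    return src_height


print(determine_target_resolution(3840, 2160))       # 4K source, no override
print(determine_target_resolution(1280, 720))        # 720p source, preserved
print(determine_target_resolution(1920, 1080, 480))  # explicit --r 480
```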

core/encode_engine.py

  • run_ffmpeg(...): Main FFmpeg encoding orchestration
    • Builds command with av1_nvenc settings
    • Per-stream audio codec/bitrate decisions
    • Handles both CQ and Bitrate modes
    • Logs detailed before/after specs
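
As an illustration of the command-building step, the sketch below assembles an FFmpeg argument list with per-stream audio options. The exact flags that run_ffmpeg passes are not shown in this document, so the function name, its parameters, and the particular choice of -rc vbr with -cq for CQ mode are assumptions.

```python
from typing import List, Optional, Tuple


def build_ffmpeg_command(
    input_file: str,
    output_file: str,
    target_height: int,
    mode: str,                       # "cq" or "bitrate" (assumed values)
    cq: int,
    video_bitrate: str,              # e.g. "900k", used in bitrate mode
    audio_plan: List[Tuple[str, Optional[str]]],  # (codec, bitrate) per audio stream
) -> List[str]:
    """Assemble an ffmpeg argument list (hypothetical helper)."""
    cmd = [
        "ffmpeg", "-y", "-i", input_file,
        "-map", "0:v:0", "-map", "0:a",
        # -2 keeps the aspect ratio and rounds the width to an even number.
        "-vf", f"scale=-2:{target_height}",
        "-c:v", "av1_nvenc",
    ]
    if mode == "cq":
        cmd += ["-rc", "vbr", "-cq", str(cq), "-b:v", "0"]
    else:
        cmd += ["-b:v", video_bitrate]
    for index, (codec, bitrate) in enumerate(audio_plan):
        # Stream specifiers such as -c:a:0 apply an option to one audio stream.
        cmd += [f"-c:a:{index}", codec]
        if codec != "copy" and bitrate:
            cmd += [f"-b:a:{index}", bitrate]
    cmd.append(output_file)
    return cmd


print(" ".join(build_ffmpeg_command(
    "in.mkv", "out.mkv", 720, "bitrate", 28, "650k",
    [("copy", None), ("aac", "384k")],
)))
```

Building the command as a list rather than a single string lets it be passed to subprocess.run without shell quoting problems for paths that contain spaces.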

core/process_manager.py

  • process_folder(...): Main batch processing loop
    • Classifies files as TV/Movie/Anime based on path
    • Detects per-file source resolution
    • Applies smart resolution defaults or explicit overrides
    • Falls back from CQ to Bitrate mode if the encoded file exceeds the size threshold
    • Tracks results in conversion_tracker.csv
    • Deletes originals after successful encoding
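
Two of the decisions in this loop, path-based classification and the CQ to Bitrate fallback check, might look like the sketch below. The folder names that are matched, the "movie" default, the 0.75 size ratio, and both function names are assumptions; the document only states that classification uses the path and that a size threshold triggers the fallback.

```python
from pathlib import PureWindowsPath


def classify_content(path: str) -> str:
    """Classify a file as 'anime', 'tv', or 'movie' from its folder names."""
    parts = [part.lower() for part in PureWindowsPath(path).parts]
    if any("anime" in part for part in parts):
        return "anime"
    if any(part in ("tv", "tv shows", "series") for part in parts):
        return "tv"
    return "movie"  # assumed default when nothing else matches


def needs_bitrate_fallback(
    source_bytes: int, encoded_bytes: int, max_ratio: float = 0.75
) -> bool:
    """True when the CQ encode is larger than the allowed share of the source."""
    return encoded_bytes > source_bytes * max_ratio


print(classify_content(r"C:\Videos\TV\ShowName\episode01.mkv"))
print(needs_bitrate_fallback(source_bytes=1_000_000_000, encoded_bytes=900_000_000))
```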

Workflow

  1. User runs: python main.py "C:\Videos\TV\ShowName" --r 720 --m bitrate
  2. main.py parses args, loads config.xml
  3. process_manager iterates video files in folder
  4. For each file:
    • video_handler detects source resolution
    • audio_handler analyzes audio streams and calculates bitrates
    • encode_engine builds FFmpeg command with smart audio/resolution settings
    • FFmpeg encodes with per-stream audio decisions
    • tracker logs results to CSV
  5. logger captures all details to logs/conversion.log
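
The tracking step at the end of each file can be sketched with the csv module. The column names and the function name append_result are hypothetical; the document names only the file, conversion_tracker.csv.

```python
import csv
from pathlib import Path

# Hypothetical column set; the real tracker's columns are not documented here.
FIELDS = ["source", "output", "source_mb", "output_mb", "mode"]


def append_result(csv_path: str, row: dict) -> None:
    """Append one result row, writing the header first if the file is new or empty."""
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    # newline="" is required so the csv module controls line endings itself.
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

A call such as append_result("conversion_tracker.csv", {"source": "ep1.mkv", "output": "ep1_av1.mkv", "source_mb": 1200, "output_mb": 410, "mode": "cq"}) adds one line per encoded file.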

Configuration Examples

Force 720p Bitrate Mode

python main.py "C:\Videos\TV\Show" --r 720 --m bitrate

Force 1080p with CQ=28

python main.py "C:\Videos\Movies" --cq 28 --r 1080

Smart Mode (preserve resolution, 4K→1080p)

python main.py "C:\Videos\Mixed"

Force 480p (for low-res content)

python main.py "C:\Videos\OldTV" --r 480

Audio Encoding Logic

Decision Tree

Stereo audio?
├─ YES + 1080p: [>192k] ENCODE to 192k AAC, [≤192k] COPY
├─ YES + 720p: [>160k] ENCODE to 160k AAC, [≤160k] COPY
└─ NO (Multichannel 6ch+): ENCODE to 384k (low) or 448k (medium) AAC
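
The tree above translates into a short function. The signature and the (codec, target_bitrate) return shape follow the description of choose_audio_bitrate in this document; everything else is an assumption, including the config values being in bits per second, None as the target when copying, treating mono like stereo, and the rule that picks "medium" only when the source exceeds the medium target.

```python
from typing import Optional, Tuple

# Assumed shape of the audio section of the config, values in bits per second.
AUDIO_CONFIG = {
    "stereo": {"high": 192000, "medium": 160000},
    "multi_channel": {"low": 384000, "medium": 448000},
}


def choose_audio_bitrate(
    channels: int, bitrate_kbps: float, audio_config: dict, is_1080_class: bool
) -> Tuple[str, Optional[int]]:
    """Return (codec, target_bitrate) for one audio stream (sketch)."""
    source_bps = bitrate_kbps * 1000
    if channels <= 2:
        # Stereo: the threshold depends on the video resolution class.
        target = audio_config["stereo"]["high" if is_1080_class else "medium"]
        if source_bps <= target:
            return ("copy", None)  # already at or below target, keep as-is
        return ("aac", target)
    # Multichannel: always re-encoded; threshold rule below is an assumption.
    multi = audio_config["multi_channel"]
    target = multi["medium"] if source_bps > multi["medium"] else multi["low"]
    return ("aac", target)


print(choose_audio_bitrate(2, 256, AUDIO_CONFIG, True))
print(choose_audio_bitrate(2, 128, AUDIO_CONFIG, True))
print(choose_audio_bitrate(6, 640, AUDIO_CONFIG, True))
```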

Rationale

  • Preserves high-quality original audio when already well-compressed
  • Re-encodes excessive bitrate audio to standard targets
  • Handles stereo at different resolutions appropriately
  • Normalizes multichannel to 6ch surround (5.1) or 2ch stereo

Files Modified/Created

New Modules (Session Work)

  • core/audio_handler.py - NEW
  • core/video_handler.py - NEW
  • core/encode_engine.py - NEW
  • core/process_manager.py - NEW
  • main.py - REFACTORED (524 lines → 70 lines)

Cleanup

  • Deleted core/ffmpeg_helper.py (code moved to audio_handler)
  • Deleted core/process_helper.py (empty)
  • Deleted core/tracker_helper.py (empty)

Enhanced

  • config.xml - Added <low>384000</low> to multi_channel audio
  • transcode.bat - Enhanced with job counting and status tracking
  • paths.txt - Queue format with --r and --m flags
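
The paths.txt format is not specified beyond carrying --r and --m flags. Assuming each line holds a quoted folder path followed by optional flags, mirroring the command-line examples in this document, a line could be split as sketched below. The function name parse_queue_line is hypothetical.

```python
import shlex
from typing import List


def parse_queue_line(line: str) -> List[str]:
    """Split one queue line into arguments; blank lines give an empty list."""
    line = line.strip()
    if not line:
        return []
    # posix=False keeps backslashes in Windows paths from being read as escapes.
    # In that mode the surrounding quotes stay on the token, so strip them.
    return [token.strip('"') for token in shlex.split(line, posix=False)]


print(parse_queue_line(r'"C:\Videos\TV\Show Name" --r 720 --m bitrate'))
```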

Validation Checklist

  • All modules pass Pylance syntax check
  • All imports resolve correctly
  • Config loads and provides expected keys
  • No unused/deprecated files remain
  • Project structure clean and maintainable

Running Tests

Verify the complete system:

# Test imports
python -c "from core.audio_handler import *; from core.video_handler import *; print('OK')"

# Run on test folder
python main.py "C:\Test\Videos" --r 720 --m bitrate

# Check logs
cat logs/conversion.log