Alternative Detection and Provenance Systems for the AI Era
Hypothetical Framework — Prepared by Adservio Innovation Lab
Olivier Vitrac (former Research Director, Université Paris-Saclay)
For internal discussion — November 2025
This memo explores next-generation traceability architectures that could complement or replace traditional fingerprinting for music rights detection. Proposals range from enhanced signal processing (phase-domain signatures, perceptual hashing) to distributed ledger technologies (blockchain registries) and cryptographic watermarking. All concepts are presented as exploratory frameworks requiring validation through pilot programs.
| Challenge | Current System Failure | Required Capability |
|---|---|---|
| Pitch/tempo transformation | Fingerprint hash mismatch | Invariance to musical transformations |
| Stem recombination | No single-source match | Multi-source attribution |
| Generative synthesis | Zero signal overlap | Provenance audit of training data |
| Metadata stripping | ISRC/ISWC lost | Embedded, non-removable signatures |
| Cross-platform diffusion | No tracking across uploads | Global registry with hash anchoring |
Embed imperceptible digital signature directly into audio waveform at creation/publication time:
Survives lossy compression (MP3, AAC)
Survives pitch shift, time stretch (within limits)
Cannot be removed without perceptual degradation
Principle: Modulate a low-amplitude pseudo-random noise sequence into the audio signal
Mathematical Formulation:

$$y(t) = x(t) + \alpha\, w(t)$$

where:
$x(t)$ — original audio signal
$w(t)$ — pseudo-random watermark sequence generated from a secret key
$\alpha$ — embedding strength, chosen small enough to remain imperceptible
$y(t)$ — watermarked audio
Detection: Correlate suspected audio with $w(t)$; a normalized correlation close to $\alpha$ (rather than zero) indicates the watermark is present
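The embed-and-correlate principle can be sketched in a few lines. The ±1 pseudo-random sequence, the embedding strength `alpha=0.05`, and the `alpha/2` detection threshold are illustrative assumptions, not production parameters:

```python
# Minimal sketch of spread-spectrum watermarking; alpha and the
# detection threshold are illustrative assumptions.
import numpy as np

def watermark_sequence(key, n):
    # Key-derived pseudo-random ±1 sequence w(t)
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=n)

def embed_watermark(x, key, alpha=0.05):
    # y(t) = x(t) + alpha * w(t)
    return x + alpha * watermark_sequence(key, len(x))

def detect_watermark(y, key, alpha=0.05):
    # Normalized correlation: ~alpha if watermarked, ~0 otherwise
    w = watermark_sequence(key, len(y))
    corr = np.dot(y, w) / len(y)
    return corr > alpha / 2

# Usage: 1 s of stand-in "audio" at 44.1 kHz
x = np.random.default_rng(0).standard_normal(44100)
y = embed_watermark(x, key=42)
print(detect_watermark(y, key=42), detect_watermark(x, key=42))
```

The correlator's noise floor shrinks as $1/\sqrt{N}$, which is why long sequences allow very small (imperceptible) values of $\alpha$.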
[Diagram — watermark lifecycle: Secret Key → generates → Watermark Sequence $w(t)$ → embedded into Original Audio → Watermarked Audio (+imperceptible noise) → AI transformation → Remixed Audio → detection → Correlation Test → high correlation required → ✓ Watermark Detected]
| Transformation | Watermark Survival | Notes |
|---|---|---|
| MP3 compression (128 kbps) | 95%+ | Designed for this |
| Pitch shift ±3 semitones | 70–85% | Requires frequency-adaptive embedding |
| Time stretch ±15% | 60–80% | Requires synchronization codes |
| Stem separation | 50–70% | Watermark may concentrate in one stem |
| Additive noise (SNR 20 dB) | 90%+ | Spread-spectrum is noise-robust |
| Re-recording (analog hole) | 30–50% | Weakest point; requires high SNR |
Scenario: UMG embeds watermarks in all new releases starting 2026
Implementation:
Embedding: During mastering, add watermark using proprietary key (e.g., per-album or per-track key)
Registry: Store hash of watermark key + ISRC on private blockchain (Layer 3)
Detection: Platforms (YouTube, TikTok) run watermark extraction on user uploads
Reporting: Detected watermarks trigger automatic reports to SACEM
Cost:
Embedding: ~€1–5 per track (one-time, integrated into mastering workflow)
Detection: Platform infrastructure (~€1M setup + €0.10 per 1000 detections)
Registry: ~€50k/year for private blockchain (see Section 4.4)
Benefit:
Survives most AI remixes (except extreme degradation)
Non-repudiable proof of origin (cryptographic)
Complements existing fingerprinting (dual-layer defense)
Motivation: Magnitude-only fingerprints ignore phase information; phase is sensitive to transformations but can be stabilized
Approach: Use instantaneous frequency (rate of phase change) and group delay (frequency-dependent time delay)
Mathematical Basis:
Instantaneous frequency: $\omega_i(t,f) = \dfrac{\partial \phi(t,f)}{\partial t}$ (rate of phase change over time)
Group delay: $\tau_g(t,f) = -\dfrac{\partial \phi(t,f)}{\partial f}$ (frequency-dependent time delay)
Advantage: Pitch shift and time stretch alter these quantities by predictable scaling factors, so the transformation can be estimated and normalized out before hashing
Implementation:
```python
# Pseudocode for phase-based fingerprint
import librosa
import numpy as np

def phase_fingerprint(audio, sr=22050):
    # Compute STFT and separate magnitude/phase
    D = librosa.stft(audio)
    magnitude, phase = np.abs(D), np.angle(D)

    # Instantaneous frequency: rate of phase change between frames
    phase_diff = np.diff(np.unwrap(phase, axis=1), axis=1)
    inst_freq = phase_diff / (2 * np.pi)

    # Identify stable phase regions (peaks)
    stable_regions = detect_peaks(inst_freq)

    # Generate hash from phase constellation
    fingerprint = hash_constellation(stable_regions)
    return fingerprint
```

Robustness (hypothetical):
Pitch shift: Normalizable (detect shift factor, correct before hashing)
Time stretch: Partially robust (phase coherence preserved)
Compression: Moderate (phase is sensitive to quantization noise)
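The "detect shift factor, correct before hashing" step can be sketched as follows: on a log-frequency axis a pitch shift becomes a constant bin offset, so cross-correlating log-spectra recovers it. The function name and the bins-per-semitone resolution are illustrative assumptions:

```python
# Sketch: estimate a pitch-shift factor from log-frequency spectra.
import numpy as np

def estimate_shift_semitones(ref_spec, sus_spec, bins_per_semitone=4):
    # Spectra are assumed already resampled onto a log-frequency axis,
    # where a pitch shift appears as a constant bin offset.
    ref = ref_spec - ref_spec.mean()
    sus = sus_spec - sus_spec.mean()
    xcorr = np.correlate(sus, ref, mode="full")
    lag = int(xcorr.argmax()) - (len(ref) - 1)  # offset in log-frequency bins
    return lag / bins_per_semitone

# Usage: a synthetic spectral peak shifted up by 8 bins (= 2 semitones here)
bins = np.arange(200)
ref = np.exp(-0.5 * ((bins - 50) / 2.0) ** 2)
sus = np.exp(-0.5 * ((bins - 58) / 2.0) ** 2)
print(estimate_shift_semitones(ref, sus))  # 2.0
```

Once the shift factor is known, the suspect constellation can be re-indexed before hashing, which is what makes the approach "normalizable".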
[Diagram — phase-fingerprint matching: Audio Signal → STFT → Magnitude + Phase → phase analysis → Instantaneous Frequency → peak detection → Phase Constellation → hash → Phase Fingerprint. In parallel: AI Remix (pitch-shifted) → STFT → Magnitude + Phase → phase analysis → Instantaneous Frequency (shifted) → normalize shift → Phase Constellation (corrected) → hash → Phase Fingerprint (matches the original) → compare → match → ✓ Detected]
Motivation: Human perception is invariant to many transformations (pitch shift within octave, slight tempo change) → train neural network to mimic this
Approach:
Train autoencoder on large music dataset
Use latent representation (compressed embedding) as "perceptual fingerprint"
Similar-sounding tracks cluster in latent space
Architecture (simplified):
```
Input: Audio spectrogram (128 mel bins × 1000 frames)
    ↓
Encoder: Conv layers → 128-dim latent vector
    ↓
Decoder: Deconv layers → reconstruct spectrogram
    ↓
Loss: Reconstruction + perceptual loss (STFT distance)
```
Advantage: Learns to ignore irrelevant variations (pitch, tempo) while preserving identity
Challenge: Requires massive training set + continuous retraining as AI remixing techniques evolve
Deployment: Could be integrated into Content ID as "second-stage filter" (broad match → perceptual verification)
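Assuming such embeddings exist, the second-stage verification reduces to a similarity test in latent space. The 0.9 threshold and function names below are illustrative assumptions:

```python
# Sketch: second-stage perceptual verification via cosine similarity
# of latent embeddings (threshold is an assumption).
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def perceptual_match(query_emb, candidate_embs, threshold=0.9):
    # Return indices of reference embeddings perceptually close to the query
    return [i for i, c in enumerate(candidate_embs)
            if cosine_similarity(query_emb, c) >= threshold]

# Usage with toy 3-dim embeddings (real latent vectors would be 128-dim)
query = np.array([1.0, 0.0, 0.0])
refs = [np.array([1.0, 0.0, 0.0]),    # identical
        np.array([0.0, 1.0, 0.0]),    # unrelated
        np.array([0.95, 0.10, 0.0])]  # near-duplicate (e.g., slight remix)
print(perceptual_match(query, refs))  # [0, 2]
```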
Motivation: If acoustic signal is destroyed, fall back to symbolic representation (melody as sequence of notes)
Approach:
Use AI transcription (e.g., Google's MT3, Spotify's Basic Pitch) to extract melody
Convert to pitch interval sequence: [+2, -1, +3, +1, ...] (semitone deltas)
Compare against SACEM's composition database (if available)
Robustness:
Pitch shift: Interval sequence unchanged (transposition-invariant)
Time stretch: Irrelevant (only pitch intervals matter)
Harmonization changes: Moderate impact (melody may be obscured)
Limitation: Requires SACEM to maintain symbolic scores for compositions (not universally available)
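The transposition invariance claimed above can be demonstrated directly with the memo's own example (MIDI note numbers are assumed as the symbolic representation):

```python
# Sketch: transposition-invariant interval encoding of a melody.
def interval_sequence(midi_notes):
    # Semitone deltas between consecutive notes
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]

original = [60, 62, 64, 67, 69]  # C, D, E, G, A (key of C)
remix    = [65, 67, 69, 72, 74]  # F, G, A, C, D (same melody in F)
print(interval_sequence(original))  # [2, 2, 3, 2]
print(interval_sequence(remix))     # [2, 2, 3, 2]
```

Because only the deltas are hashed, any remix that preserves the melody, whatever the key, maps to the same sequence.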
[Diagram — melodic contour matching: AI Remix Audio → transcription → Symbolic Melody [C, D, E, G, A] → interval encoding → Interval Sequence [+2, +2, +3, +2] → compare against SACEM Composition DB (stored interval sequences) → match → ✓ Composition Detected. Transposition example: Original Composition (key of C) vs. Remix in key of F with melody [F, G, A, C, D] — interval encoding yields the same sequence [+2, +2, +3, +2]]
Current SACEM registry is centralized, opaque to rights holders
Blockchain provides tamper-proof audit trail and cryptographic proof of registration time
Clarification: This is not about "Web3 monetization" or NFTs. It's about using distributed ledger as forensic infrastructure.
Design:
SACEM maintains authoritative registry (legal/administrative continuity)
Blockchain stores cryptographic hashes of works + timestamps
Rights holders can independently verify registrations (trustless audit)
[Diagram — hybrid registry architecture: SACEM (Authoritative) maintains the Work Registry (ISWC, metadata, splits) and Royalty Distribution Logic. Each registration anchors a hash on a Private Blockchain (Audit Layer) storing entries such as Block N: hash(ISRC_1, t_1), Block N+1: hash(ISRC_2, t_2), Block N+2: hash(ISRC_3, t_3). Rights Holders register works and verify timestamps; Platforms perform detection and registry lookups; an Independent Auditor verifies integrity]
Blockchain Choice:
Not Ethereum (too expensive, public, slow)
Private consortium chain (e.g., Hyperledger Fabric, Polygon Edge)
Validators: SACEM, major publishers (UMG, Sony, Warner), platforms (Spotify, YouTube)
Throughput: 1000+ transactions/sec
Cost: ~€0.0001 per registration
Data Structure:
```json
{
  "block_number": 12345,
  "timestamp": "2025-11-05T14:32:00Z",
  "transactions": [
    {
      "tx_id": "0xabc123...",
      "type": "work_registration",
      "hash": "sha256(ISRC + audio_fingerprint + watermark_key)",
      "metadata": {
        "ISRC": "FRZ123456789",
        "ISWC": "T-123.456.789-0",
        "registrant": "Universal Music Publishing"
      }
    }
  ]
}
```

Verification Flow:
```python
# Pseudocode: Platform verifies work detected in upload
def verify_work(detected_isrc, detected_audio):
    # Query SACEM database
    sacem_record = sacem_api.lookup(detected_isrc)

    # Query blockchain for hash
    blockchain_record = blockchain_api.get_block_by_isrc(detected_isrc)

    # Compute hash of detected audio
    computed_hash = sha256(detected_isrc + extract_fingerprint(detected_audio))

    # Verify integrity
    if blockchain_record['hash'] == computed_hash:
        return {"verified": True, "timestamp": blockchain_record['timestamp']}
    else:
        return {"verified": False, "reason": "Hash mismatch"}
```

Scenario: Two parties claim ownership of the same melody
Classical Process:
SACEM reviews submissions → slow (months)
Relies on paper records, email timestamps (disputable)
Blockchain-Enhanced Process:
Query blockchain for earliest registration of melody hash
Cryptographic timestamp is non-repudiable
Dispute resolved in days (not months)
Setup Cost:
Blockchain infrastructure: €200k (one-time)
Integration with SACEM systems: €500k (one-time)
Annual maintenance: €50k
Operational Cost:
€0.0001 per work registration
100k new works/year → €10/year in transaction fees
Benefit:
Faster dispute resolution (save €100k+/year in legal fees)
Enhanced trust with artists/publishers (intangible)
Compliance with future EU AI Act transparency requirements
Current State:
AI models (Suno, Udio, MusicLM) trained on massive datasets
No disclosure of which works were included
Rights holders cannot prove their works were used
Proposed Solution: Require AI companies to:
Register training datasets with hash manifests
Anchor manifests on blockchain (tamper-proof)
Pay "training royalties" to rights holders (via SACEM or direct)
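A hash manifest along these lines might look as follows; the field names and manifest layout are assumptions for illustration, not a proposed standard:

```python
# Sketch: building a hash manifest for a training dataset
# (field names and layout are illustrative assumptions).
import hashlib
import json

def dataset_manifest(tracks):
    # tracks: list of (ISRC, audio_bytes) pairs
    entries = [{"ISRC": isrc, "hash": hashlib.sha256(audio).hexdigest()}
               for isrc, audio in tracks]
    # Deterministic manifest-level hash, suitable for on-chain anchoring
    manifest_hash = hashlib.sha256(
        json.dumps(entries, sort_keys=True).encode()).hexdigest()
    return {"entries": entries, "manifest_hash": manifest_hash}

# Usage with placeholder audio bytes
m = dataset_manifest([("FRZ123456789", b"track-1-audio"),
                      ("FRZ123456790", b"track-2-audio")])
print(len(m["entries"]), len(m["manifest_hash"]))  # 2 64
```

Anchoring only `manifest_hash` on-chain keeps per-dataset storage constant regardless of catalog size.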
[Diagram — AI training audit flow: AI Music Company (e.g., Suno) compiles a Training Dataset (1M tracks) → generates a Dataset Manifest (list of ISRCs + hashes) → anchors it on the Blockchain Registry. SACEM / Rights Holders audit the manifest and invoice the training fee. At generation time: User → prompt → generate → Synthetic Track (per-generation fee?)]
Method:
AI company computes fingerprint (or watermark hash) for each training track
Generates Merkle tree of hashes
Publishes Merkle root on blockchain
Verification:
Rights holder provides ISRC → AI company provides Merkle proof → verified if proof valid
Merkle Tree Structure:
```
                Root Hash
               /         \
        Hash(A+B)       Hash(C+D)
        /      \        /      \
   Hash(A)  Hash(B)  Hash(C)  Hash(D)
      |        |        |        |
   Track_1  Track_2  Track_3  Track_4
```
Advantage: Compact proof (log(N) size), tamper-evident
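A minimal sketch of the proof mechanism follows. SHA-256 with pairwise concatenation is used, and duplicating the last node on odd-sized levels is one common convention assumed here:

```python
# Sketch: Merkle root, membership proof, and verification (SHA-256;
# odd levels duplicate the last node — an assumed convention).
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    # Sibling hashes on the path from leaf `index` to the root
    level, proof = [h(x) for x in leaves], []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], index % 2))  # (hash, node-is-right flag)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_proof(leaf, proof, root):
    node = h(leaf)
    for sibling, node_is_right in proof:
        node = h(sibling + node) if node_is_right else h(node + sibling)
    return node == root

# Usage: prove Track_3 was in a 4-track dataset with a 2-hash proof
leaves = [b"Track_1", b"Track_2", b"Track_3", b"Track_4"]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 2)
print(verify_proof(b"Track_3", proof, root))  # True
print(verify_proof(b"Track_X", proof, root))  # False
```

For a 1M-track dataset, a proof contains only ~20 hashes, which is the log(N) compactness claimed above.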
Current EU AI Act (2024):
High-risk AI systems must document training data
Music generation models are not classified as high-risk (as of 2024)
Proposed Amendment (Vivendi could advocate):
Classify music generation as "high-risk to IP rights"
Mandate dataset disclosure + training royalty payments
Enforcement: Platforms must verify AI-generated content comes from compliant models
Use blockchain smart contracts to automate royalty splits and payments:
Platform detects usage → triggers smart contract
Contract automatically splits payment (e.g., 50% publisher, 30% songwriter, 20% performer)
Settlement in real-time (not quarterly as with SACEM)
```solidity
// Solidity-style pseudocode (simplified)
contract MusicRoyalty {
    struct Work {
        string ISRC;
        address payable publisher;
        address payable songwriter;
        uint8 publisherShare;   // percentage
        uint8 songwriterShare;  // percentage
    }

    mapping(string => Work) public works;

    function registerWork(
        string memory ISRC,
        address payable publisher,
        address payable songwriter,
        uint8 pubShare,
        uint8 songShare
    ) public {
        works[ISRC] = Work(ISRC, publisher, songwriter, pubShare, songShare);
    }

    function reportUsage(string memory ISRC) public payable {
        Work memory work = works[ISRC];
        uint256 pubAmount = msg.value * work.publisherShare / 100;
        uint256 songAmount = msg.value * work.songwriterShare / 100;

        work.publisher.transfer(pubAmount);
        work.songwriter.transfer(songAmount);
    }
}
```

Hybrid Model:
SACEM retains legal authority (contracts, disputes)
Smart contract handles settlement only (not rights management)
Monthly reconciliation between on-chain payments and SACEM records
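The monthly reconciliation step could be sketched as a simple diff between the two payment ledgers; the record field names are assumptions:

```python
# Sketch: diff on-chain payouts against SACEM records
# (field names are illustrative assumptions).
def reconcile(onchain_payments, sacem_records):
    # Each input: list of {"ISRC": ..., "amount": ...} records
    onchain = {p["ISRC"]: p["amount"] for p in onchain_payments}
    sacem = {r["ISRC"]: r["amount"] for r in sacem_records}
    return {isrc: {"onchain": onchain.get(isrc, 0), "sacem": sacem.get(isrc, 0)}
            for isrc in onchain.keys() | sacem.keys()
            if onchain.get(isrc, 0) != sacem.get(isrc, 0)}

# Usage: one matching record, one mismatch to flag for review
diff = reconcile([{"ISRC": "FRZ1", "amount": 100}, {"ISRC": "FRZ2", "amount": 50}],
                 [{"ISRC": "FRZ1", "amount": 100}, {"ISRC": "FRZ2", "amount": 45}])
print(diff)  # {'FRZ2': {'onchain': 50, 'sacem': 45}}
```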
Speed: Real-time payments (vs. quarterly SACEM distributions)
Transparency: All payments auditable on-chain
Reduced friction: No intermediary holding funds
Volatility: If settled in cryptocurrency (avoid by using stablecoins or fiat-pegged tokens)
Gas fees: Solved by using Layer 2 or private chain (€0.0001/transaction)
Legal recognition: Smart contract payouts must be recognized by tax authorities (ongoing policy work)
Creation Phase
Artist creates work
Mastering studio embeds watermark
Publisher registers with SACEM
Hash anchored on blockchain
Distribution Phase
Work uploaded to platforms
Platforms store reference fingerprints
Platforms install watermark detectors
Usage Phase
User uploads remix/derivative
Platform runs multi-layer detection:
1. Acoustic fingerprint
2. Phase-domain signature
3. Watermark extraction
Platform queries blockchain registry
Smart contract triggered
Settlement Phase
Royalty split calculated
Payments transferred (real-time)
SACEM receives audit trail
Monthly reconciliation
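The multi-layer detection in the usage phase can be sketched as a cascade that stops at the first confident identification; the detector functions here are stand-ins for the fingerprint, phase, and watermark systems described earlier:

```python
# Sketch: multi-layer detection cascade (detector functions are stand-ins).
def detect(upload, detectors):
    # detectors: ordered list of (layer_name, fn); fn returns an ISRC or None
    for name, fn in detectors:
        work = fn(upload)
        if work is not None:
            return {"layer": name, "work": work}
    return None

# Usage with stub detectors: fingerprint misses, phase layer matches
detectors = [
    ("acoustic_fingerprint", lambda u: None),
    ("phase_signature", lambda u: "FRZ123456789"),
    ("watermark", lambda u: None),
]
print(detect(b"remix-audio", detectors))
# {'layer': 'phase_signature', 'work': 'FRZ123456789'}
```

Ordering cheap, high-recall layers first keeps per-upload cost low while the later layers handle harder transformations.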
| Requirement | Watermarking | Phase-Domain | Perceptual Hash | Blockchain Registry | AI Training Audit |
|---|---|---|---|---|---|
| Pitch shift robustness | High | Very High | Very High | N/A | N/A |
| Tempo change robustness | Medium | High | Very High | N/A | N/A |
| Generative AI detection | Low* | Low | Low | N/A | High |
| Metadata-free detection | Very High | High | High | N/A | N/A |
| Tamper-proof provenance | High | Low | Low | Very High | Very High |
| Retroactive applicability | No† | Yes | Yes | Yes | No‡ |
| Deployment cost | Medium | High | Very High | Low | Low |
| Industry readiness | High | Low | Medium | Low | Very Low |
\*Watermarking detects generative AI only if training data was watermarked
†Cannot watermark existing releases (requires re-mastering)
‡Cannot audit past training datasets (but can enforce for future models)
Goal: Improve detection of AI remixes within existing infrastructure
Actions:
Pilot watermarking on new UMG releases (select high-value artists)
Partner with platforms to integrate watermark detection alongside Content ID
Establish private blockchain registry (Vivendi + SACEM + major publishers)
Investment: ~€1M (setup) + €200k/year (operations)
Goal: Expand detection coverage and automate settlements
Actions:
Scale watermarking to 100% of new releases
Deploy phase-domain fingerprinting as second-stage filter on YouTube, TikTok
Launch smart contract pilot for real-time royalty splits
Advocate for EU AI Act amendment (training data transparency)
Investment: ~€5M (cumulative)
Goal: Establish Vivendi as leader in AI-resilient IP protection
Actions:
Industry standard: Propose watermarking + blockchain as ISO/IEC standard for music traceability
AI training royalties: Secure regulatory mandate for training dataset disclosure + compensation
Cross-sector expansion: Apply model to video (Canal+), games (Vivendi Gaming)
Investment: ~€10M (cumulative)
| Risk | Mitigation |
|---|---|
| Watermarks defeated by adversarial AI | Use adaptive embedding (update keys annually) |
| Blockchain scalability limits | Use Layer 2 or sharding (proven for 10k+ TPS) |
| False positives in perceptual hashing | Combine with human review for high-value disputes |
| Risk | Mitigation |
|---|---|
| Platforms refuse to integrate | Regulatory pressure (EU Article 17 enforcement) |
| AI companies evade training audits | Mandate at model deployment (platform-level checks) |
| High deployment costs | Phase rollout; prioritize high-value catalog |
| Risk | Mitigation |
|---|---|
| Smart contracts not legally recognized | Hybrid model (SACEM retains legal authority) |
| GDPR concerns (blockchain immutability) | Store only hashes (not personal data) on-chain |
| Anti-trust (Vivendi dominance) | Open consortium (include Sony, Warner, independents) |
Watermarking is most mature and deployable today
Protects new releases, but not back catalog
Survives most AI transformations (except extreme)
Phase-domain / perceptual hashing offers best robustness
But requires significant R&D and platform buy-in
Blockchain provides provenance and audit layer
Cheap, trustless, but doesn't detect by itself
AI training audits address generative synthesis
Requires regulatory mandate (not yet in place)
Optimal strategy is multi-layered
Combine watermarking (embedded defense) + enhanced detection (phase/perceptual) + blockchain (provenance) + policy advocacy (training royalties)
Vivendi should champion a "defense-in-depth" approach:
Invest in watermarking (immediate protection)
Partner on phase-domain R&D (medium-term)
Lead blockchain consortium (establish standard)
Advocate for AI training transparency (long-term policy win)
Memo 5 will synthesize findings into a strategic roadmap with:
Discussion questions for CTO meeting
Pilot program proposals
Regulatory engagement strategy
Success metrics and timeline
End of Memo 4 Prepared by Adservio Innovation Lab — Hypothetical Framework Contact: olivier.vitrac@adservio.fr