<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>API · SpeechDatasets</title><meta name="title" content="API · SpeechDatasets"/><meta property="og:title" content="API · SpeechDatasets"/><meta property="twitter:title" content="API · SpeechDatasets"/><meta name="description" content="Documentation for SpeechDatasets."/><meta property="og:description" content="Documentation for SpeechDatasets."/><meta property="twitter:description" content="Documentation for SpeechDatasets."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" 
href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.svg" alt="SpeechDatasets logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">SpeechDatasets</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../installation/">Installation</a></li><li><a class="tocitem" href="../examples/">Examples</a></li><li class="is-active"><a class="tocitem" href>API</a><ul class="internal"><li><a class="tocitem" href="#Load-a-Dataset"><span>Load a Dataset</span></a></li><li><a class="tocitem" href="#Types"><span>Types</span></a></li><li><a class="tocitem" href="#Lexicons"><span>Lexicons</span></a></li><li><a class="tocitem" href="#Index"><span>Index</span></a></li></ul></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid 
fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>API</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>API</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl" title="View the repository"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">Repository</span></a><a class="docs-navbar-link" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/main/docs/src/api.md" title="Edit source"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="API"><a class="docs-heading-anchor" href="#API">API</a><a id="API-1"></a><a class="docs-heading-anchor-permalink" href="#API" title="Permalink"></a></h1><h2 id="Load-a-Dataset"><a class="docs-heading-anchor" href="#Load-a-Dataset">Load a Dataset</a><a id="Load-a-Dataset-1"></a><a class="docs-heading-anchor-permalink" href="#Load-a-Dataset" title="Permalink"></a></h2><p>To get data from a supported dataset, you only need one function:</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.dataset" href="#SpeechDatasets.dataset"><code>SpeechDatasets.dataset</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">dataset(dataset, 
inputdir::AbstractString, outputdir::AbstractString; kwargs...)</code></pre><p>Create a <a href="#SpeechDataset"><code>SpeechDataset</code></a> object for <code>dataset</code>. <code>inputdir</code> is the directory containing the raw data. If <code>inputdir</code> does not exist and the data is freely available, the data will be downloaded automatically and put in <code>inputdir</code>. <code>outputdir</code> is the directory where summary files will be stored. <code>kwargs...</code> are dataset-specific arguments passed to <code>dataset</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/dataset.jl#L74-L82">source</a></section></article><h2 id="Types"><a class="docs-heading-anchor" href="#Types">Types</a><a id="Types-1"></a><a class="docs-heading-anchor-permalink" href="#Types" title="Permalink"></a></h2><h3 id="SpeechDataset"><a class="docs-heading-anchor" href="#SpeechDataset">SpeechDataset</a><a id="SpeechDataset-1"></a><a class="docs-heading-anchor-permalink" href="#SpeechDataset" title="Permalink"></a></h3><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.SpeechDataset" href="#SpeechDatasets.SpeechDataset"><code>SpeechDatasets.SpeechDataset</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">SpeechDataset</code></pre><p>Store metadata about a speech dataset.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/dataset.jl#L4-L8">source</a></section></article><p>Access a single element with integer or id 
indexing</p><pre><code class="language-julia hljs"># ds::SpeechDataset
ds["1988-147956-0027"]</code></pre><h3 id="Manifest-items"><a class="docs-heading-anchor" href="#Manifest-items">Manifest items</a><a id="Manifest-items-1"></a><a class="docs-heading-anchor-permalink" href="#Manifest-items" title="Permalink"></a></h3><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.ManifestItem" href="#SpeechDatasets.ManifestItem"><code>SpeechDatasets.ManifestItem</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">abstract type ManifestItem end</code></pre><p>Abstract base type for all manifest items. Every manifest item should have an <code>id</code> attribute.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/manifest_item.jl#L3-L8">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.Recording" href="#SpeechDatasets.Recording"><code>SpeechDatasets.Recording</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">struct Recording{Ts<:AbstractAudioSource} <: ManifestItem
id::AbstractString
source::Ts
channels::Vector{Int}
samplerate::Int
end</code></pre><p>A recording is an audio source associated with an id.</p><p><strong>Constructors</strong></p><pre><code class="nohighlight hljs">Recording(id, source, channels, samplerate)
Recording(id, source[; channels = missing, samplerate = missing])</code></pre><p>If the channels or the sample rate are not provided, they will be read from <code>source</code>.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>When preparing a large corpus, omitting the channels and/or the sample rate can drastically slow down processing, because each source must then be read to obtain them.</p></div></div></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/manifest_item.jl#L11-L32">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.Annotation" href="#SpeechDatasets.Annotation"><code>SpeechDatasets.Annotation</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">struct Annotation <: ManifestItem
id::AbstractString
recording_id::AbstractString
start::Float64
duration::Float64
channel::Union{Vector, Colon}
data::Dict
end</code></pre><p>An "annotation" defines a segment of a recording on a single channel. The <code>data</code> field is an arbitrary dictionary holding the nature of the annotation. <code>start</code> and <code>duration</code> (in seconds) define where the segment is located within the recording <code>recording_id</code>.</p><p><strong>Constructors</strong></p><pre><code class="nohighlight hljs">Annotation(id, recording_id, start, duration, channel, data)
Annotation(id, recording_id[; channel = missing, start = -1, duration = -1, data = missing])</code></pre><p>If <code>start</code> and/or <code>duration</code> are negative, the segment is considered to span the whole length of the recording.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/manifest_item.jl#L49-L71">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="AudioSources.load-Tuple{Recording}" href="#AudioSources.load-Tuple{Recording}"><code>AudioSources.load</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">load(recording::Recording [; start = -1, duration = -1, channels = recording.channels])
load(recording, annotation)</code></pre><p>Load the signal from a recording. <code>start</code> and <code>duration</code> are expressed in seconds.</p><p>The function returns a tuple <code>(x, sr)</code> where <code>x</code> is an <span>$N×C$</span> array:</p><ul><li><span>$N$</span> is the length of the signal and <span>$C$</span> is the number of channels;</li><li><code>sr</code> is the sampling rate of the signal.</li></ul></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/manifest_item.jl#L85-L94">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="AudioSources.load-Tuple{Recording, Annotation}" href="#AudioSources.load-Tuple{Recording, Annotation}"><code>AudioSources.load</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">load(r::Recording, a::Annotation)
load(t::Tuple{Recording, Annotation})</code></pre><p>Load only a segment of the recording referenced in the annotation.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/manifest_item.jl#L107-L111">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.load_manifest-Tuple{Type{<:Union{Annotation, Recording}}, Any}" href="#SpeechDatasets.load_manifest-Tuple{Type{<:Union{Annotation, Recording}}, Any}"><code>SpeechDatasets.load_manifest</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">load_manifest(Annotation, path)
load_manifest(Recording, path)</code></pre><p>Load a Recording or Annotation manifest from <code>path</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/manifest_io.jl#L131-L136">source</a></section></article><h2 id="Lexicons"><a class="docs-heading-anchor" href="#Lexicons">Lexicons</a><a id="Lexicons-1"></a><a class="docs-heading-anchor-permalink" href="#Lexicons" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.CMUDICT-Tuple{Any}" href="#SpeechDatasets.CMUDICT-Tuple{Any}"><code>SpeechDatasets.CMUDICT</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">CMUDICT(path)</code></pre><p>Return the pronunciation dictionary loaded from the CMU Sphinx dictionary. The CMU dictionary will be downloaded and stored in <code>path</code>. 
Subsequent calls will only read the file <code>path</code> without downloading the data again.</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/lexicons.jl#L16-L22">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.TIMITDICT-Tuple{Any}" href="#SpeechDatasets.TIMITDICT-Tuple{Any}"><code>SpeechDatasets.TIMITDICT</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">TIMITDICT(timitdir)</code></pre><p>Return the pronunciation dictionary provided by the TIMIT corpus (located in <code>timitdir</code>).</p></div><a class="docs-sourcelink" target="_blank" href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/lexicons.jl#L47-L52">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="SpeechDatasets.MFAFRDICT-Tuple{Any}" href="#SpeechDatasets.MFAFRDICT-Tuple{Any}"><code>SpeechDatasets.MFAFRDICT</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">MFAFRDICT(path)</code></pre><p>Return the French pronunciation dictionary provided by MFA (french_mfa v2.0.0a).</p></div><a class="docs-sourcelink" target="_blank" 
href="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/tree/18285e8aa78294c368f53f84647f2a8df4df2737/src/lexicons.jl#L76-L80">source</a></section></article><h2 id="Index"><a class="docs-heading-anchor" href="#Index">Index</a><a id="Index-1"></a><a class="docs-heading-anchor-permalink" href="#Index" title="Permalink"></a></h2><ul><li><a href="#SpeechDatasets.Annotation"><code>SpeechDatasets.Annotation</code></a></li><li><a href="#SpeechDatasets.ManifestItem"><code>SpeechDatasets.ManifestItem</code></a></li><li><a href="#SpeechDatasets.Recording"><code>SpeechDatasets.Recording</code></a></li><li><a href="#SpeechDatasets.SpeechDataset"><code>SpeechDatasets.SpeechDataset</code></a></li><li><a href="#AudioSources.load-Tuple{Recording, Annotation}"><code>AudioSources.load</code></a></li><li><a href="#AudioSources.load-Tuple{Recording}"><code>AudioSources.load</code></a></li><li><a href="#SpeechDatasets.CMUDICT-Tuple{Any}"><code>SpeechDatasets.CMUDICT</code></a></li><li><a href="#SpeechDatasets.MFAFRDICT-Tuple{Any}"><code>SpeechDatasets.MFAFRDICT</code></a></li><li><a href="#SpeechDatasets.TIMITDICT-Tuple{Any}"><code>SpeechDatasets.TIMITDICT</code></a></li><li><a href="#SpeechDatasets.dataset"><code>SpeechDatasets.dataset</code></a></li><li><a href="#SpeechDatasets.load_manifest-Tuple{Type{<:Union{Annotation, Recording}}, Any}"><code>SpeechDatasets.load_manifest</code></a></li></ul></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../examples/">« Examples</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section 
class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.8.0 on <span class="colophon-date" title="Friday 14 February 2025 13:55">Friday 14 February 2025</span>. Using Julia version 1.10.8.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>