Merge branch 'main' into patch-3

tcely 2025-04-13 08:30:11 -04:00 committed by GitHub
commit 7d33cd8579
GPG Key ID: B5690EEEBB952194
17 changed files with 657 additions and 175 deletions

View File

@ -63,7 +63,7 @@ directory will be `video` and `audio` subdirectories. All media which only has
an audio stream (such as music) will download to the `audio` directory. All media with a
video stream will be downloaded to the `video` directory. All administration of
TubeSync is performed via a web interface. You can optionally add a media server,
currently just Plex, to complete the PVR experience.
currently only Jellyfin or Plex, to complete the PVR experience.
# Installation
@ -221,7 +221,7 @@ As media is indexed and downloaded it will appear in the "media" tab.
### 3. Media Server updating
Currently TubeSync supports Plex as a media server. You can add your local Plex server
Currently TubeSync supports Plex and Jellyfin as media servers. You can add your local Jellyfin or Plex server
under the "media servers" tab.
@ -234,6 +234,13 @@ view these with:
$ docker logs --follow tubesync
```
To include logs with an issue report, please extract them to a file and attach it to the issue.
The command below creates the `TubeSync.logs.txt` file with the logs from the `tubesync` container:
```bash
docker logs -t tubesync > TubeSync.logs.txt 2>&1
```
# Advanced usage guides
@ -250,7 +257,15 @@ and less common features:
# Warnings
### 1. Index frequency
### 1. Automated file renaming
> [!IMPORTANT]
> Currently, file renaming is not enabled by default.
> Enabling this feature by default is planned in an upcoming release, after `2025-06-01`.
>
> To prevent your installation from scheduling media file renaming tasks,
> you must set `TUBESYNC_RENAME_ALL_SOURCES=False` in the environment variables.
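With the documented `docker run` deployment this is one more `-e` flag; the container name and volume paths below are placeholders:

```bash
# Illustrative only: opt out of scheduled media file renaming tasks.
docker run -d --name tubesync \
  -e TUBESYNC_RENAME_ALL_SOURCES=False \
  -v /some/directory/tubesync-config:/config \
  -v /some/directory/tubesync-downloads:/downloads \
  -p 4848:4848 \
  ghcr.io/meeb/tubesync:latest
```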
### 2. Index frequency
It's a good idea to add sources with as long an index frequency as possible. This is
the duration between indexes of the source. An index is when TubeSync checks to see
@ -258,7 +273,7 @@ what videos are available on a channel or playlist to find new media. Try and keep this as
long as possible, up to 24 hours.
### 2. Indexing massive channels
### 3. Indexing massive channels
If you add a massive (several thousand videos) channel to TubeSync and choose "index
every hour" or similar short interval it's entirely possible your TubeSync install may
@ -372,21 +387,25 @@ There are a number of other environment variables you can set. These are mostly
useful if you are manually installing TubeSync in some other environment. These are:
| Name | What | Example |
| ---------------------------- | ------------------------------------------------------------- |--------------------------------------|
| ---------------------------- | ------------------------------------------------------------- |-------------------------------------------------------------------------------|
| DJANGO_SECRET_KEY | Django's SECRET_KEY | YJySXnQLB7UVZw2dXKDWxI5lEZaImK6l |
| DJANGO_URL_PREFIX | Run TubeSync in a sub-URL on the web server | /somepath/ |
| TUBESYNC_DEBUG | Enable debugging | True |
| TUBESYNC_WORKERS | Number of background workers, default is 2, max allowed is 8 | 2 |
| TUBESYNC_HOSTS | Django's ALLOWED_HOSTS, defaults to `*` | tubesync.example.com,otherhost.com |
| TUBESYNC_RESET_DOWNLOAD_DIR | Toggle resetting `/downloads` permissions, defaults to True | True |
| TUBESYNC_VIDEO_HEIGHT_CUTOFF | Smallest video height in pixels permitted to download | 240 |
| TUBESYNC_RENAME_SOURCES | Rename media files from selected sources | Source1_directory,Source2_directory |
| TUBESYNC_RENAME_ALL_SOURCES | Rename media files from all sources | True |
| TUBESYNC_DIRECTORY_PREFIX | Enable `video` and `audio` directory prefixes in `/downloads` | True |
| GUNICORN_WORKERS | Number of gunicorn workers to spawn | 3 |
| LISTEN_HOST | IP address for gunicorn to listen on | 127.0.0.1 |
| LISTEN_PORT | Port number for gunicorn to listen on | 8080 |
| TUBESYNC_SHRINK_NEW | Filter unneeded information from newly retrieved metadata | True |
| TUBESYNC_SHRINK_OLD | Filter unneeded information from metadata loaded from the database | True |
| TUBESYNC_WORKERS | Number of background threads per (task runner) process. Default is 1. Max allowed is 8. | 2 |
| GUNICORN_WORKERS | Number of `gunicorn` (web request) workers to spawn | 3 |
| LISTEN_HOST | IP address for `gunicorn` to listen on | 127.0.0.1 |
| LISTEN_PORT | Port number for `gunicorn` to listen on | 8080 |
| HTTP_USER | Sets the username for HTTP basic authentication | some-username |
| HTTP_PASS | Sets the password for HTTP basic authentication | some-secure-password |
| DATABASE_CONNECTION | Optional external database connection details | mysql://user:pass@host:port/database |
| DATABASE_CONNECTION | Optional external database connection details | postgresql://user:pass@host:port/database |
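For a manual install these can live in a simple environment file; the values below are placeholders drawn from the examples in the table:

```bash
# Hypothetical environment file for a manual (non-container) install.
export DJANGO_SECRET_KEY='YJySXnQLB7UVZw2dXKDWxI5lEZaImK6l'
export TUBESYNC_HOSTS='tubesync.example.com'
export TUBESYNC_WORKERS=2
export GUNICORN_WORKERS=3
export LISTEN_HOST=127.0.0.1
export LISTEN_PORT=8080
export DATABASE_CONNECTION='postgresql://user:pass@localhost:5432/tubesync'
```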
# Manual, non-containerised, installation
@ -396,7 +415,7 @@ following this rough guide, you are on your own and should be knowledgeable abou
installing and running WSGI-based Python web applications before attempting this.
1. Clone or download this repo
2. Make sure you're running a modern version of Python (>=3.6) and have Pipenv
2. Make sure you're running a modern version of Python (>=3.9) and have Pipenv
installed
3. Set up the environment with `pipenv install`
4. Copy `tubesync/tubesync/local_settings.py.example` to

View File

@ -2,4 +2,5 @@
exec nice -n "${TUBESYNC_NICE:-1}" s6-setuidgid app \
/usr/bin/python3 /app/manage.py process_tasks \
--queue database
--queue database --duration 86400 \
--sleep "30.${RANDOM}"

View File

@ -2,4 +2,5 @@
exec nice -n "${TUBESYNC_NICE:-1}" s6-setuidgid app \
/usr/bin/python3 /app/manage.py process_tasks \
--queue filesystem
--queue filesystem --duration 43200 \
--sleep "20.${RANDOM}"

View File

@ -2,4 +2,5 @@
exec nice -n "${TUBESYNC_NICE:-1}" s6-setuidgid app \
/usr/bin/python3 /app/manage.py process_tasks \
--queue network
--queue network --duration 43200 \
--sleep "10.${RANDOM}"

tubesync/restart_services.sh (new executable file, 26 lines)
View File

@ -0,0 +1,26 @@
#!/usr/bin/env sh
dir='/run/service'
svc_path() (
cd "${dir}"
realpath -e -s "$@"
)
if [ 0 -eq $# ]
then
set -- \
$( cd "${dir}" && svc_path tubesync*-worker ) \
"$( svc_path gunicorn )" \
"$( svc_path nginx )"
fi
for service in $( svc_path "$@" )
do
printf -- 'Restarting %-28s' "${service#${dir}/}..."
_began="$( date '+%s' )"
/command/s6-svc -wr -r "${service}"
_ended="$( date '+%s' )"
printf -- '\tcompleted (in %2.1d seconds).\n' \
"$( expr "${_ended}" - "${_began}" )"
done
unset -v _began _ended service
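Assuming the script is shipped at `/app/restart_services.sh` inside the container (a guess from the repository layout), it allows bouncing the supervised services without restarting the whole container:

```bash
# Hypothetical usage: restart all worker, gunicorn and nginx services,
# or only the service(s) named as arguments.
docker exec tubesync /app/restart_services.sh
docker exec tubesync /app/restart_services.sh gunicorn
```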

View File

@ -2,19 +2,18 @@ import os
import uuid
from django.utils.translation import gettext_lazy as _
from django.core.management.base import BaseCommand, CommandError
from django.db.models import signals
from django.db.transaction import atomic
from common.logger import log
from sync.models import Source, Media, MediaServer
from sync.signals import media_post_delete
from sync.tasks import schedule_media_servers_update
class Command(BaseCommand):
help = ('Deletes a source by UUID')
help = _('Deletes a source by UUID')
def add_arguments(self, parser):
parser.add_argument('--source', action='store', required=True, help='Source UUID')
parser.add_argument('--source', action='store', required=True, help=_('Source UUID'))
def handle(self, *args, **options):
source_uuid_str = options.get('source', '')
@ -30,11 +29,13 @@ class Command(BaseCommand):
raise CommandError(f'Source does not exist with '
f'UUID: {source_uuid}')
# Reconfigure the source to not update the disk or media servers
with atomic(durable=True):
source.deactivate()
# Delete the source, triggering pre-delete signals for each media item
log.info(f'Found source with UUID "{source.uuid}" with name '
f'"{source.name}" and deleting it, this may take some time!')
log.info(f'Source directory: {source.directory_path}')
with atomic(durable=True):
source.delete()
# Update any media servers
schedule_media_servers_update()
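Assuming the usual Django layout (`sync/management/commands/delete_source.py`), the command is invoked through `manage.py`; the UUID below is a placeholder:

```bash
# Hypothetical invocation from inside the container.
docker exec -it tubesync \
  python3 /app/manage.py delete_source --source 00000000-0000-0000-0000-000000000000
```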

View File

@ -3,7 +3,7 @@ from django.db.transaction import atomic
from django.utils.translation import gettext_lazy as _
from background_task.models import Task
from sync.models import Source
from sync.tasks import index_source_task
from sync.tasks import index_source_task, check_source_directory_exists
from common.logger import log
@ -13,21 +13,28 @@ class Command(BaseCommand):
help = 'Resets all tasks'
@atomic(durable=True)
def handle(self, *args, **options):
log.info('Resetting all tasks...')
with atomic(durable=True):
# Delete all tasks
Task.objects.all().delete()
# Iter all tasks
# Iter all sources, creating new tasks
for source in Source.objects.all():
verbose_name = _('Check download directory exists for source "{}"')
check_source_directory_exists(
str(source.pk),
verbose_name=verbose_name.format(source.name),
)
# Recreate the initial indexing task
log.info(f'Resetting tasks for source: {source}')
verbose_name = _('Index media from source "{}"')
index_source_task(
str(source.pk),
repeat=source.index_schedule,
verbose_name=verbose_name.format(source.name)
verbose_name=verbose_name.format(source.name),
)
with atomic(durable=True):
for source in Source.objects.all():
# This also chains down to call each Media objects .save() as well
source.save()
log.info('Done')
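The equivalent CLI entry point (assuming the conventional `reset_tasks.py` filename) clears every queued task and reschedules the directory checks and indexing shown above:

```bash
# Hypothetical invocation from inside the container.
docker exec -it tubesync python3 /app/manage.py reset_tasks
```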

View File

@ -5,6 +5,7 @@ from django.forms import ValidationError
from urllib.parse import urlsplit, urlunsplit, urlencode
from django.utils.translation import gettext_lazy as _
from common.logger import log
from django.conf import settings
class MediaServerError(Exception):
@ -18,14 +19,52 @@ class MediaServer:
TIMEOUT = 0
HELP = ''
default_headers = {'User-Agent': 'TubeSync'}
def __init__(self, mediaserver_instance):
self.object = mediaserver_instance
self.headers = dict(**self.default_headers)
self.token = None
def make_request_args(self, uri='/', token_header=None, headers={}, token_param=None, params={}):
base_parts = urlsplit(self.object.url)
if self.token is None:
self.token = self.object.loaded_options['token'] or None
if token_header and self.token:
headers.update({token_header: self.token})
self.headers.update(headers)
if token_param and self.token:
params.update({token_param: self.token})
qs = urlencode(params)
enable_verify = (
base_parts.scheme.endswith('s') and
self.object.verify_https
)
url = urlunsplit((base_parts.scheme, base_parts.netloc, uri, qs, ''))
return (url, dict(
headers=self.headers,
verify=enable_verify,
timeout=self.TIMEOUT,
))
def make_request(self, uri='/', /, *, headers={}, params={}):
'''
A very simple implementation is:
url, kwargs = self.make_request_args(uri=uri, headers=headers, params=params)
return requests.get(url, **kwargs)
'''
raise NotImplementedError('MediaServer.make_request() must be implemented')
def validate(self):
'''
Called to check that the configured media server values are correct.
'''
raise NotImplementedError('MediaServer.validate() must be implemented')
def update(self):
'''
Called after the `Media` instance has saved a downloaded file.
'''
raise NotImplementedError('MediaServer.update() must be implemented')
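Putting the three hooks together, a minimal hypothetical subclass might look like the sketch below; the class name and `/refresh` endpoint are illustrative, not part of the commit:

```python
import requests

class ExampleMediaServer(MediaServer):
    TIMEOUT = 5

    def make_request(self, uri='/', /, *, headers={}, params={}):
        # Exactly the shape suggested by the docstring above.
        url, kwargs = self.make_request_args(uri=uri, headers=headers, params=params)
        return requests.get(url, **kwargs)

    def validate(self):
        # Check host, port and token here; raise on invalid configuration.
        return True

    def update(self):
        # Ask the server to rescan its libraries after a download.
        return self.make_request('/refresh').status_code == 200
```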
@ -48,30 +87,22 @@ class PlexMediaServer(MediaServer):
'<a href="https://www.plexopedia.com/plex-media-server/api/server/libraries/" '
'target="_blank">here</a></p>.')
def make_request(self, uri='/', params={}):
headers = {'User-Agent': 'TubeSync'}
token = self.object.loaded_options['token']
params['X-Plex-Token'] = token
base_parts = urlsplit(self.object.url)
qs = urlencode(params)
url = urlunsplit((base_parts.scheme, base_parts.netloc, uri, qs, ''))
if self.object.verify_https:
def make_request(self, uri='/', /, *, headers={}, params={}):
url, kwargs = self.make_request_args(uri=uri, headers=headers, token_param='X-Plex-Token', params=params)
log.debug(f'[plex media server] Making HTTP GET request to: {url}')
return requests.get(url, headers=headers, verify=True,
timeout=self.TIMEOUT)
else:
if self.object.use_https and not kwargs['verify']:
# If not validating SSL, given this is likely going to be for an internal
# or private network, that Plex issues certs *.hash.plex.direct and that
# the warning won't ever be sensibly seen in the HTTPS logs, hide it
with warnings.catch_warnings():
warnings.simplefilter("ignore")
return requests.get(url, headers=headers, verify=False,
timeout=self.TIMEOUT)
return requests.get(url, **kwargs)
return requests.get(url, **kwargs)
def validate(self):
'''
A Plex server requires a host, port, access token and a comma-separated
list if library IDs.
list of library IDs.
'''
# Check all the required values are present
if not self.object.host:
@ -172,19 +203,47 @@ class JellyfinMediaServer(MediaServer):
HELP = _('<p>To connect your TubeSync server to your Jellyfin Media Server, please enter the details below.</p>'
'<p>The <strong>host</strong> can be either an IP address or a valid hostname.</p>'
'<p>The <strong>port</strong> should be between 1 and 65536.</p>'
'<p>The <strong>token</strong> is required for API access. You can generate a token in your Jellyfin user profile settings.</p>'
'<p>The <strong>libraries</strong> is a comma-separated list of library IDs in Jellyfin.</p>')
'<p>The "API Key" <strong>token</strong> is required for API access. Your Jellyfin administrator can generate an "API Key" token for use with TubeSync for you.</p>'
'<p>The <strong>libraries</strong> is a comma-separated list of library IDs in Jellyfin. Leave this blank to see a list.</p>')
def make_request(self, uri='/', params={}):
headers = {
'User-Agent': 'TubeSync',
'X-Emby-Token': self.object.loaded_options['token'] # Jellyfin uses the same `X-Emby-Token` header as Emby
}
def make_request(self, uri='/', /, *, headers={}, params={}, data={}, json=None, method='GET'):
assert method in {'GET', 'POST'}, f'Unimplemented method: {method}'
url = f'{self.object.url}{uri}'
log.debug(f'[jellyfin media server] Making HTTP GET request to: {url}')
headers.update({'Content-Type': 'application/json'})
url, kwargs = self.make_request_args(uri=uri, token_header='X-Emby-Token', headers=headers, params=params)
# From the Emby source code;
# this is the order in which the headers are tried:
# X-Emby-Authorization: ('MediaBrowser'|'Emby') 'Token'=<token_value>, 'Client'=<client_value>, 'Version'=<version_value>
# X-Emby-Token: <token_value>
# X-MediaBrowser-Token: <token_value>
# Jellyfin uses 'Authorization' first,
# then optionally falls back to the 'X-Emby-Authorization' header.
# Jellyfin uses (") around values, but not keys in that header.
token = kwargs['headers'].get('X-Emby-Token', None)
if token:
kwargs['headers'].update({
'X-MediaBrowser-Token': token,
'X-Emby-Authorization': f'Emby Token={token}, Client=TubeSync, Version={settings.VERSION}',
'Authorization': f'MediaBrowser Token="{token}", Client="TubeSync", Version="{settings.VERSION}"',
})
return requests.get(url, headers=headers, verify=self.object.verify_https, timeout=self.TIMEOUT)
log.debug(f'[jellyfin media server] Making HTTP {method} request to: {url}')
if self.object.use_https and not kwargs['verify']:
# not verifying certificates
with warnings.catch_warnings():
warnings.simplefilter("ignore")
return requests.request(
method, url,
data=data,
json=json,
**kwargs,
)
return requests.request(
method, url,
data=data,
json=json,
**kwargs,
)
def validate(self):
if not self.object.host:
@ -245,8 +304,8 @@ class JellyfinMediaServer(MediaServer):
def update(self):
libraries = self.object.loaded_options.get('libraries', '').split(',')
for library_id in map(str.strip, libraries):
uri = f'/Library/{library_id}/Refresh'
response = self.make_request(uri)
uri = f'/Items/{library_id}/Refresh'
response = self.make_request(uri, method='POST')
if response.status_code != 204: # 204 No Content is expected for successful refresh
raise MediaServerError(f'Failed to refresh Jellyfin library "{library_id}", status code: {response.status_code}')
return True
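The switch from `/Library/{id}/Refresh` to `/Items/{id}/Refresh` matches Jellyfin's item-refresh endpoint, which answers 204 No Content on success. The same request by hand, with placeholder host, token and library ID:

```bash
# Placeholders throughout; expect "HTTP/1.1 204 No Content" on success.
curl -i -X POST \
  -H 'X-Emby-Token: your-api-key' \
  'http://jellyfin.local:8096/Items/your-library-id/Refresh'
```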

View File

@ -0,0 +1,52 @@
# Generated by Django 5.1.8 on 2025-04-11 07:36
import django.db.models.deletion
import sync.models
import uuid
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [
('sync', '0030_alter_source_source_vcodec'),
]
operations = [
migrations.CreateModel(
name='Metadata',
fields=[
('uuid', models.UUIDField(default=uuid.uuid4, editable=False, help_text='UUID of the metadata', primary_key=True, serialize=False, verbose_name='uuid')),
('site', models.CharField(blank=True, default='Youtube', help_text='Site from which the metadata was retrieved', max_length=256, verbose_name='site')),
('key', models.CharField(blank=True, default='', help_text='Media identifier at the site from which the metadata was retrieved', max_length=256, verbose_name='key')),
('created', models.DateTimeField(auto_now_add=True, db_index=True, help_text='Date and time the metadata was created', verbose_name='created')),
('retrieved', models.DateTimeField(auto_now_add=True, db_index=True, help_text='Date and time the metadata was retrieved', verbose_name='retrieved')),
('uploaded', models.DateTimeField(help_text='Date and time the media was uploaded', null=True, verbose_name='uploaded')),
('published', models.DateTimeField(help_text='Date and time the media was published', null=True, verbose_name='published')),
('value', models.JSONField(default=dict, encoder=sync.models.JSONEncoder, help_text='JSON metadata object', verbose_name='value')),
('media', models.ForeignKey(help_text='Media the metadata belongs to', on_delete=django.db.models.deletion.CASCADE, related_name='metadata_media', to='sync.media')),
],
options={
'verbose_name': 'Metadata about a Media item',
'verbose_name_plural': 'Metadata about a Media item',
'unique_together': {('media', 'site', 'key')},
},
),
migrations.CreateModel(
name='MetadataFormat',
fields=[
('uuid', models.UUIDField(default=uuid.uuid4, editable=False, help_text='UUID of the format', primary_key=True, serialize=False, verbose_name='uuid')),
('site', models.CharField(blank=True, default='Youtube', help_text='Site from which the format is available', max_length=256, verbose_name='site')),
('key', models.CharField(blank=True, default='', help_text='Media identifier at the site for which this format is available', max_length=256, verbose_name='key')),
('number', models.PositiveIntegerField(help_text='Ordering number for this format', verbose_name='number')),
('code', models.CharField(blank=True, default='', help_text='Format identification code', max_length=64, verbose_name='code')),
('value', models.JSONField(default=dict, encoder=sync.models.JSONEncoder, help_text='JSON metadata format object', verbose_name='value')),
('metadata', models.ForeignKey(help_text='Metadata the format belongs to', on_delete=django.db.models.deletion.CASCADE, related_name='metadataformat_metadata', to='sync.metadata')),
],
options={
'verbose_name': 'Format from the Metadata about a Media item',
'verbose_name_plural': 'Formats from the Metadata about a Media item',
'unique_together': {('metadata', 'site', 'key', 'code'), ('metadata', 'site', 'key', 'number')},
},
),
]
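The container images typically apply migrations during startup; for a manual install, the two new tables arrive with the usual command:

```bash
# Applies migration 0031 (Metadata and MetadataFormat) for the sync app.
python3 manage.py migrate sync
```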

View File

@ -11,7 +11,9 @@ from django.conf import settings
from django.db import models
from django.core.exceptions import SuspiciousOperation
from django.core.files.storage import FileSystemStorage
from django.core.serializers.json import DjangoJSONEncoder
from django.core.validators import RegexValidator
from django.db.transaction import atomic
from django.utils.text import slugify
from django.utils import timezone
from django.utils.translation import gettext_lazy as _
@ -35,6 +37,20 @@ from .choices import (Val, CapChoices, Fallback, FileExtension,
media_file_storage = FileSystemStorage(location=str(settings.DOWNLOAD_ROOT), base_url='/media-data/')
_srctype_dict = lambda n: dict(zip( YouTube_SourceType.values, (n,) * len(YouTube_SourceType.values) ))
class JSONEncoder(DjangoJSONEncoder):
item_separator = ','
key_separator = ':'
def default(self, obj):
try:
iterable = iter(obj)
except TypeError:
pass
else:
return list(iterable)
return super().default(obj)
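A quick sketch of what this encoder changes relative to `DjangoJSONEncoder`: compact separators, and any plain iterable falls through `default()` and is serialised as a JSON array. Assuming the class above is importable:

```python
import json

# The generator exercises the default() fallback for plain iterables.
print(json.dumps({'ids': iter(range(3))}, cls=JSONEncoder))  # {"ids":[0,1,2]}
```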
class Source(models.Model):
'''
A Source is a source of media. Currently, this is either a YouTube channel
@ -833,11 +849,14 @@ class Media(models.Model):
fields = self.METADATA_FIELDS.get(field, {})
return fields.get(self.source.source_type, field)
def get_metadata_first_value(self, iterable, default=None, /):
def get_metadata_first_value(self, iterable, default=None, /, *, arg_dict=None):
'''
fetch the first key with a value from metadata
'''
if arg_dict is None:
arg_dict = self.loaded_metadata
assert isinstance(arg_dict, dict), type(arg_dict)
# str is an iterable of characters
# we do not want to look for each character!
if isinstance(iterable, str):
@ -845,7 +864,7 @@ class Media(models.Model):
for key in tuple(iterable):
# reminder: unmapped fields return the key itself
field = self.get_metadata_field(key)
value = self.loaded_metadata.get(field)
value = arg_dict.get(field)
# value can be None because:
# - None was stored at the key
# - the key was not in the dictionary
@ -1079,6 +1098,24 @@ class Media(models.Model):
return self.metadata is not None
@atomic(durable=False)
def metadata_load(self, arg_str='{}'):
data = json.loads(arg_str) or self.loaded_metadata
site = self.get_metadata_first_value('extractor_key', arg_dict=data)
epoch = self.get_metadata_first_value('epoch', arg_dict=data)
epoch_dt = self.metadata_published( epoch )
release = self.get_metadata_first_value(('release_timestamp', 'timestamp',), arg_dict=data)
release_dt = self.metadata_published( release )
md = self.metadata_media.get_or_create(site=site, key=self.key)[0]
md.value = data
formats = md.value.pop(self.get_metadata_field('formats'), list())
md.retrieved = epoch_dt
md.uploaded = self.published
md.published = release_dt or self.published
md.save()
md.ingest_formats(formats)
def save_to_metadata(self, key, value, /):
data = self.loaded_metadata
data[key] = value
@ -1681,6 +1718,152 @@ class Media(models.Model):
pass
class Metadata(models.Model):
'''
Metadata for an indexed `Media` item.
'''
class Meta:
verbose_name = _('Metadata about a Media item')
verbose_name_plural = _('Metadata about a Media item')
unique_together = (
('media', 'site', 'key'),
)
uuid = models.UUIDField(
_('uuid'),
primary_key=True,
editable=False,
default=uuid.uuid4,
help_text=_('UUID of the metadata'),
)
media = models.ForeignKey(
Media,
# on_delete=models.DO_NOTHING,
on_delete=models.CASCADE,
related_name='metadata_media',
help_text=_('Media the metadata belongs to'),
null=False,
)
site = models.CharField(
_('site'),
max_length=256,
blank=True,
null=False,
default='Youtube',
help_text=_('Site from which the metadata was retrieved'),
)
key = models.CharField(
_('key'),
max_length=256,
blank=True,
null=False,
default='',
help_text=_('Media identifier at the site from which the metadata was retrieved'),
)
created = models.DateTimeField(
_('created'),
auto_now_add=True,
db_index=True,
help_text=_('Date and time the metadata was created'),
)
retrieved = models.DateTimeField(
_('retrieved'),
auto_now_add=True,
db_index=True,
help_text=_('Date and time the metadata was retrieved'),
)
uploaded = models.DateTimeField(
_('uploaded'),
null=True,
help_text=_('Date and time the media was uploaded'),
)
published = models.DateTimeField(
_('published'),
null=True,
help_text=_('Date and time the media was published'),
)
value = models.JSONField(
_('value'),
encoder=JSONEncoder,
null=False,
default=dict,
help_text=_('JSON metadata object'),
)
@atomic(durable=False)
def ingest_formats(self, formats=list(), /):
for number, format in enumerate(formats, start=1):
mdf = self.metadataformat_metadata.get_or_create(site=self.site, key=self.key, code=format.get('format_id'), number=number)[0]
mdf.value = format
mdf.save()
class MetadataFormat(models.Model):
'''
A format from the Metadata for an indexed `Media` item.
'''
class Meta:
verbose_name = _('Format from the Metadata about a Media item')
verbose_name_plural = _('Formats from the Metadata about a Media item')
unique_together = (
('metadata', 'site', 'key', 'number'),
('metadata', 'site', 'key', 'code'),
)
uuid = models.UUIDField(
_('uuid'),
primary_key=True,
editable=False,
default=uuid.uuid4,
help_text=_('UUID of the format'),
)
metadata = models.ForeignKey(
Metadata,
# on_delete=models.DO_NOTHING,
on_delete=models.CASCADE,
related_name='metadataformat_metadata',
help_text=_('Metadata the format belongs to'),
null=False,
)
site = models.CharField(
_('site'),
max_length=256,
blank=True,
null=False,
default='Youtube',
help_text=_('Site from which the format is available'),
)
key = models.CharField(
_('key'),
max_length=256,
blank=True,
null=False,
default='',
help_text=_('Media identifier at the site for which this format is available'),
)
number = models.PositiveIntegerField(
_('number'),
blank=False,
null=False,
help_text=_('Ordering number for this format')
)
code = models.CharField(
_('code'),
max_length=64,
blank=True,
null=False,
default='',
help_text=_('Format identification code'),
)
value = models.JSONField(
_('value'),
encoder=JSONEncoder,
null=False,
default=dict,
help_text=_('JSON metadata format object'),
)
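As a hypothetical illustration of the new relations, fetching the stored format codes for one media item in their original listing order goes through the two `related_name` reverse managers defined above:

```python
# media: an existing Media instance; assumes one Metadata row per (media, site, key).
md = media.metadata_media.get(site='Youtube', key=media.key)
codes = md.metadataformat_metadata.order_by('number').values_list('code', flat=True)
```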
class MediaServer(models.Model):
'''
A remote media server, such as a Plex server.

View File

@ -1,8 +1,9 @@
from functools import partial
from pathlib import Path
from shutil import rmtree
from tempfile import TemporaryDirectory
from django.conf import settings
from django.db.models.signals import pre_save, post_save, pre_delete, post_delete
from django.db.transaction import on_commit
from django.dispatch import receiver
from django.utils.translation import gettext_lazy as _
from background_task.signals import task_failed
@ -20,6 +21,20 @@ from .filtering import filter_media
from .choices import Val, YouTube_SourceType
def is_relative_to(self, *other):
"""Return True if the path is relative to another path or False.
"""
try:
self.relative_to(*other)
return True
except ValueError:
return False
# patch Path for Python 3.8
if not hasattr(Path, 'is_relative_to'):
Path.is_relative_to = is_relative_to
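`Path.is_relative_to()` was only added in Python 3.9, hence this backport for 3.8. Behaviourally:

```python
from pathlib import Path

Path('/downloads/video/clip.mp4').is_relative_to('/downloads')  # True
Path('/config/db.sqlite3').is_relative_to('/downloads')         # False
```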
@receiver(pre_save, sender=Source)
def source_pre_save(sender, instance, **kwargs):
# Triggered before a source is saved, if the schedule has been updated recreate
@ -134,6 +149,7 @@ def source_post_save(sender, instance, created, **kwargs):
def source_pre_delete(sender, instance, **kwargs):
# Triggered before a source is deleted, delete all media objects to trigger
# the Media models post_delete signal
source = instance
log.info(f'Deactivating source: {instance.name}')
instance.deactivate()
log.info(f'Deleting tasks for source: {instance.name}')
@ -141,20 +157,22 @@ def source_pre_delete(sender, instance, **kwargs):
delete_task_by_source('sync.tasks.check_source_directory_exists', instance.pk)
delete_task_by_source('sync.tasks.rename_all_media_for_source', instance.pk)
delete_task_by_source('sync.tasks.save_all_media_for_source', instance.pk)
# Fetch the media source
sqs = Source.objects.filter(filter_text=str(source.pk))
if sqs.count():
media_source = sqs[0]
# Schedule deletion of media
delete_task_by_source('sync.tasks.delete_all_media_for_source', instance.pk)
delete_task_by_source('sync.tasks.delete_all_media_for_source', media_source.pk)
verbose_name = _('Deleting all media for source "{}"')
delete_all_media_for_source(
str(instance.pk),
str(instance.name),
verbose_name=verbose_name.format(instance.name),
)
# Try to do it all immediately
# If this is killed, the scheduled task should do the work instead.
delete_all_media_for_source.now(
str(instance.pk),
str(instance.name),
)
on_commit(partial(
delete_all_media_for_source,
str(media_source.pk),
str(media_source.name),
str(media_source.directory_path),
priority=1,
verbose_name=verbose_name.format(media_source.name),
))
@receiver(post_delete, sender=Source)
@ -164,14 +182,8 @@ def source_post_delete(sender, instance, **kwargs):
log.info(f'Deleting tasks for removed source: {source.name}')
delete_task_by_source('sync.tasks.index_source_task', instance.pk)
delete_task_by_source('sync.tasks.check_source_directory_exists', instance.pk)
delete_task_by_source('sync.tasks.delete_all_media_for_source', instance.pk)
delete_task_by_source('sync.tasks.rename_all_media_for_source', instance.pk)
delete_task_by_source('sync.tasks.save_all_media_for_source', instance.pk)
# Remove the directory, if the user requested that
directory_path = Path(source.directory_path)
if (directory_path / '.to_be_removed').is_file():
log.info(f'Deleting directory for: {source.name}: {directory_path}')
rmtree(directory_path, True)
@receiver(task_failed, sender=Task)
@ -250,8 +262,10 @@ def media_post_save(sender, instance, created, **kwargs):
if not instance.thumb and not instance.skip:
thumbnail_url = instance.thumbnail
if thumbnail_url:
log.info(f'Scheduling task to download thumbnail for: {instance.name} '
f'from: {thumbnail_url}')
log.info(
'Scheduling task to download thumbnail'
f' for: {instance.name} from: {thumbnail_url}'
)
verbose_name = _('Downloading thumbnail for "{}"')
download_media_thumbnail(
str(instance.pk),
@ -289,8 +303,10 @@ def media_pre_delete(sender, instance, **kwargs):
delete_task_by_media('sync.tasks.wait_for_media_premiere', (str(instance.pk),))
thumbnail_url = instance.thumbnail
if thumbnail_url:
delete_task_by_media('sync.tasks.download_media_thumbnail',
(str(instance.pk), thumbnail_url))
delete_task_by_media(
'sync.tasks.download_media_thumbnail',
(str(instance.pk), thumbnail_url,),
)
# Remove thumbnail file for deleted media
if instance.thumb:
instance.thumb.delete(save=False)

View File

@ -10,13 +10,14 @@ import math
import uuid
from io import BytesIO
from hashlib import sha1
from pathlib import Path
from datetime import datetime, timedelta
from shutil import copyfile
from shutil import copyfile, rmtree
from PIL import Image
from django.conf import settings
from django.core.files.base import ContentFile
from django.core.files.uploadedfile import SimpleUploadedFile
from django.db import connection, DatabaseError, IntegrityError
from django.db import connection, reset_queries, DatabaseError, IntegrityError
from django.db.transaction import atomic
from django.utils import timezone
from django.utils.translation import gettext_lazy as _
@ -24,12 +25,13 @@ from background_task import background
from background_task.exceptions import InvalidTaskError
from background_task.models import Task, CompletedTask
from common.logger import log
from common.errors import NoMediaException, NoMetadataException, DownloadFailedException
from common.errors import ( NoFormatException, NoMediaException,
NoMetadataException, DownloadFailedException, )
from common.utils import json_serial, remove_enclosed
from .choices import Val, TaskQueue
from .models import Source, Media, MediaServer
from .utils import (get_remote_image, resize_image_to_height, delete_file,
write_text_file, filter_response)
from .utils import ( get_remote_image, resize_image_to_height, delete_file,
write_text_file, filter_response, )
from .youtube import YouTubeError
@ -54,7 +56,7 @@ def map_task_to_instance(task):
'sync.tasks.download_media': Media,
'sync.tasks.download_media_metadata': Media,
'sync.tasks.save_all_media_for_source': Source,
'sync.tasks.refesh_formats': Media,
'sync.tasks.refresh_formats': Media,
'sync.tasks.rename_media': Media,
'sync.tasks.rename_all_media_for_source': Source,
'sync.tasks.wait_for_media_premiere': Media,
@ -121,7 +123,6 @@ def update_task_status(task, status):
else:
task.verbose_name = f'[{status}] {task._verbose_name}'
try:
with atomic():
task.save(update_fields={'verbose_name'})
except DatabaseError as e:
if 'Save with update_fields did not affect any rows.' == str(e):
@ -210,17 +211,15 @@ def save_model(instance):
instance.save()
@atomic(durable=False)
def schedule_media_servers_update():
with atomic():
# Schedule a task to update media servers
log.info(f'Scheduling media server updates')
verbose_name = _('Request media server rescan for "{}"')
for mediaserver in MediaServer.objects.all():
rescan_media_server(
str(mediaserver.pk),
priority=10,
verbose_name=verbose_name.format(mediaserver),
remove_existing_tasks=True,
)
@ -228,7 +227,13 @@ def cleanup_old_media():
with atomic():
for source in Source.objects.filter(delete_old_media=True, days_to_keep__gt=0):
delta = timezone.now() - timedelta(days=source.days_to_keep)
for media in source.media_source.filter(downloaded=True, download_date__lt=delta):
mqs = source.media_source.defer(
'metadata',
).filter(
downloaded=True,
download_date__lt=delta,
)
for media in mqs:
log.info(f'Deleting expired media: {source} / {media} '
f'(now older than {source.days_to_keep} days / '
f'download_date before {delta})')
@ -242,8 +247,12 @@ def cleanup_removed_media(source, videos):
if not source.delete_removed_media:
return
log.info(f'Cleaning up media no longer in source: {source}')
media_objects = Media.objects.filter(source=source)
for media in media_objects:
mqs = Media.objects.defer(
'metadata',
).filter(
source=source,
)
for media in mqs:
matching_source_item = [video['id'] for video in videos if video['id'] == media.key]
if not matching_source_item:
log.info(f'{media.name} is no longer in source, removing')
@ -252,11 +261,12 @@ def cleanup_removed_media(source, videos):
schedule_media_servers_update()
@background(schedule=dict(priority=10, run_at=30), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
@background(schedule=dict(priority=20, run_at=30), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
def index_source_task(source_id):
'''
Indexes media available from a Source object.
'''
reset_queries()
cleanup_completed_tasks()
# deleting expired media should happen any time an index task is requested
cleanup_old_media()
@ -330,7 +340,6 @@ def index_source_task(source_id):
verbose_name = _('Downloading metadata for "{}"')
download_media_metadata(
str(media.pk),
priority=20,
verbose_name=verbose_name.format(media.pk),
)
# Reset task.verbose_name to the saved value
@ -358,7 +367,7 @@ def check_source_directory_exists(source_id):
source.make_directory()
@background(schedule=dict(priority=5, run_at=10), queue=Val(TaskQueue.NET))
@background(schedule=dict(priority=10, run_at=10), queue=Val(TaskQueue.NET))
def download_source_images(source_id):
'''
Downloads an image and save it as a local thumbnail attached to a
@ -408,7 +417,7 @@ def download_source_images(source_id):
log.info(f'Thumbnail downloaded for source with ID: {source_id} / {source}')
@background(schedule=dict(priority=20, run_at=60), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
@background(schedule=dict(priority=40, run_at=60), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
def download_media_metadata(media_id):
'''
Downloads the metadata for a media item.
@ -492,7 +501,7 @@ def download_media_metadata(media_id):
f'{source} / {media}: {media_id}')
@background(schedule=dict(priority=15, run_at=10), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
@background(schedule=dict(priority=10, run_at=10), queue=Val(TaskQueue.FS), remove_existing_tasks=True)
def download_media_thumbnail(media_id, url):
'''
Downloads an image from a URL and save it as a local thumbnail attached to a
@ -530,7 +539,7 @@ def download_media_thumbnail(media_id, url):
return True
@background(schedule=dict(priority=15, run_at=60), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
@background(schedule=dict(priority=30, run_at=60), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
def download_media(media_id):
'''
Downloads the media to disk and attaches it to the Media instance.
@ -576,9 +585,36 @@ def download_media(media_id):
f'not downloading')
return
filepath = media.filepath
container = format_str = None
log.info(f'Downloading media: {media} (UUID: {media.pk}) to: "{filepath}"')
try:
format_str, container = media.download_media()
if os.path.exists(filepath):
except NoFormatException as e:
# Try refreshing formats
if media.has_metadata:
log.debug(f'Scheduling a task to refresh metadata for: {media.key}: "{media.name}"')
refresh_formats(
str(media.pk),
verbose_name=f'Refreshing metadata formats for: {media.key}: "{media.name}"',
)
log.exception(str(e))
raise
else:
if not os.path.exists(filepath):
# Try refreshing formats
if media.has_metadata:
log.debug(f'Scheduling a task to refresh metadata for: {media.key}: "{media.name}"')
refresh_formats(
str(media.pk),
verbose_name=f'Refreshing metadata formats for: {media.key}: "{media.name}"',
)
# Expected file doesn't exist on disk
err = (f'Failed to download media: {media} (UUID: {media.pk}) to disk, '
f'expected outfile does not exist: {filepath}')
log.error(err)
# Raising an error here triggers the task to be re-attempted (or fail)
raise DownloadFailedException(err)
# Media has been downloaded successfully
log.info(f'Successfully downloaded media: {media} (UUID: {media.pk}) to: '
f'"{filepath}"')
@ -640,16 +676,6 @@ def download_media(media_id):
pass
# Schedule a task to update media servers
schedule_media_servers_update()
else:
# Expected file doesn't exist on disk
err = (f'Failed to download media: {media} (UUID: {media.pk}) to disk, '
f'expected outfile does not exist: {filepath}')
log.error(err)
# Try refreshing formats
if media.has_metadata:
media.refresh_formats
# Raising an error here triggers the task to be re-attempted (or fail)
raise DownloadFailedException(err)
@background(schedule=dict(priority=0, run_at=30), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
@ -667,7 +693,7 @@ def rescan_media_server(mediaserver_id):
mediaserver.update()
@background(schedule=dict(priority=25, run_at=600), queue=Val(TaskQueue.FS), remove_existing_tasks=True)
@background(schedule=dict(priority=30, run_at=600), queue=Val(TaskQueue.FS), remove_existing_tasks=True)
def save_all_media_for_source(source_id):
'''
Iterates all media items linked to a source and saves them to
@ -675,6 +701,7 @@ def save_all_media_for_source(source_id):
source has its parameters changed and all media needs to be
checked to see if its download status has changed.
'''
reset_queries()
try:
source = Source.objects.get(pk=source_id)
except Source.DoesNotExist as e:
@ -684,15 +711,26 @@ def save_all_media_for_source(source_id):
raise InvalidTaskError(_('no such source')) from e
saved_later = set()
mqs = Media.objects.filter(source=source)
task = get_source_check_task(source_id)
refresh_qs = mqs.filter(
refresh_qs = Media.objects.all().only(
'pk',
'uuid',
'key',
'title', # for name property
).filter(
source=source,
can_download=False,
skip=False,
manual_skip=False,
downloaded=False,
metadata__isnull=False,
)
uuid_qs = Media.objects.all().only(
'pk',
'uuid',
).filter(
source=source,
).values_list('uuid', flat=True)
task = get_source_check_task(source_id)
if task:
task._verbose_name = remove_enclosed(
task.verbose_name, '[', ']', ' ',
@ -702,7 +740,7 @@ def save_all_media_for_source(source_id):
tvn_format = '1/{:,}' + f'/{refresh_qs.count():,}'
for mn, media in enumerate(refresh_qs, start=1):
update_task_status(task, tvn_format.format(mn))
refesh_formats(
refresh_formats(
str(media.pk),
verbose_name=f'Refreshing metadata formats for: {media.key}: "{media.name}"',
)
@ -710,17 +748,23 @@ def save_all_media_for_source(source_id):
# Trigger the post_save signal for each media item linked to this source as various
# flags may need to be recalculated
tvn_format = '2/{:,}' + f'/{mqs.count():,}'
for mn, media in enumerate(mqs, start=1):
if media.uuid not in saved_later:
tvn_format = '2/{:,}' + f'/{uuid_qs.count():,}'
for mn, media_uuid in enumerate(uuid_qs, start=1):
if media_uuid not in saved_later:
update_task_status(task, tvn_format.format(mn))
try:
media = Media.objects.get(pk=str(media_uuid))
except Media.DoesNotExist as e:
log.exception(str(e))
pass
else:
save_model(media)
# Reset task.verbose_name to the saved value
update_task_status(task, None)
@background(schedule=dict(priority=10, run_at=0), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
def refesh_formats(media_id):
@background(schedule=dict(priority=50, run_at=0), queue=Val(TaskQueue.NET), remove_existing_tasks=True)
def refresh_formats(media_id):
try:
media = Media.objects.get(pk=media_id)
except Media.DoesNotExist as e:
@ -765,7 +809,6 @@ def rename_all_media_for_source(source_id):
if not create_rename_tasks:
return
mqs = Media.objects.all().defer(
'metadata',
'thumb',
).filter(
source=source,
@ -799,8 +842,9 @@ def wait_for_media_premiere(media_id):
update_task_status(task, f'available in {hours(media.published - now)} hours')
save_model(media)
@background(schedule=dict(priority=1, run_at=300), queue=Val(TaskQueue.FS), remove_existing_tasks=False)
def delete_all_media_for_source(source_id, source_name):
@background(schedule=dict(priority=1, run_at=90), queue=Val(TaskQueue.FS), remove_existing_tasks=False)
def delete_all_media_for_source(source_id, source_name, source_directory):
source = None
try:
source = Source.objects.get(pk=source_id)
@ -814,8 +858,21 @@ def delete_all_media_for_source(source_id, source_name):
).filter(
source=source or source_id,
)
with atomic(durable=True):
for media in mqs:
log.info(f'Deleting media for source: {source_name} item: {media.name}')
with atomic():
media.delete()
# Remove the directory, if the user requested that
directory_path = Path(source_directory)
remove = (
(source and source.delete_removed_media) or
(directory_path / '.to_be_removed').is_file()
)
if source:
with atomic(durable=True):
source.delete()
if remove:
log.info(f'Deleting directory for: {source_name}: {directory_path}')
rmtree(directory_path, True)

View File

@ -99,6 +99,18 @@
</div>
</div>
</div>
<div class="row">
<div class="col s12">
<h2 class="truncate">Warnings</h2>
<div class="collection-item">
An upcoming release, after <b>2025-06-01</b>, will introduce automated file renaming.<br>
To prevent this change from taking effect, you can set an environment variable before that date.<br>
See the <a href="https://github.com/meeb/tubesync#warnings" rel="external noreferrer">GitHub README</a>
for more details or ask questions using
issue <a href="https://github.com/meeb/tubesync/issues/785" rel="external noreferrer">#785</a>.<br>
</div>
</div>
</div>
<div class="row">
<div class="col s12">
<h2 class="truncate">Runtime information</h2>

View File

@ -26,11 +26,12 @@ from .models import Source, Media, MediaServer
from .forms import (ValidateSourceForm, ConfirmDeleteSourceForm, RedownloadMediaForm,
SkipMediaForm, EnableMediaForm, ResetTasksForm,
ConfirmDeleteMediaServerForm)
from .utils import validate_url, delete_file, multi_key_sort
from .utils import validate_url, delete_file, multi_key_sort, mkdir_p
from .tasks import (map_task_to_instance, get_error_message,
get_source_completed_tasks, get_media_download_task,
delete_task_by_media, index_source_task, migrate_queues)
from .choices import (Val, MediaServerType, SourceResolution,
delete_task_by_media, index_source_task,
check_source_directory_exists, migrate_queues)
from .choices import (Val, MediaServerType, SourceResolution, IndexSchedule,
YouTube_SourceType, youtube_long_source_types,
youtube_help, youtube_validation_urls)
from . import signals
@ -410,11 +411,39 @@ class DeleteSourceView(DeleteView, FormMixin):
context_object_name = 'source'
def post(self, request, *args, **kwargs):
source = self.get_object()
media_source = dict(
uuid=None,
index_schedule=IndexSchedule.NEVER,
download_media=False,
index_videos=False,
index_streams=False,
filter_text=str(source.pk),
)
copy_fields = set(map(lambda f: f.name, source._meta.fields)) - set(media_source.keys())
for k, v in source.__dict__.items():
if k in copy_fields:
media_source[k] = v
media_source = Source(**media_source)
delete_media_val = request.POST.get('delete_media', False)
delete_media = True if delete_media_val is not False else False
# overload this boolean for our own use
media_source.delete_removed_media = delete_media
# adjust the directory and key on the source to be deleted
source.directory = source.directory + '/deleted'
source.key = source.key + '/deleted'
source.name = f'[Deleting] {source.name}'
source.save(update_fields={'directory', 'key', 'name'})
source.refresh_from_db()
# save the new media source now that it is not a duplicate
media_source.uuid = None
media_source.save()
media_source.refresh_from_db()
# switch the media to the new source instance
Media.objects.filter(source=source).update(source=media_source)
if delete_media:
source = self.get_object()
directory_path = pathlib.Path(source.directory_path)
directory_path = pathlib.Path(media_source.directory_path)
mkdir_p(directory_path)
(directory_path / '.to_be_removed').touch(exist_ok=True)
return super().post(request, *args, **kwargs)
@ -931,6 +960,11 @@ class ResetTasks(FormView):
Task.objects.all().delete()
# Iter all tasks
for source in Source.objects.all():
verbose_name = _('Check download directory exists for source "{}"')
check_source_directory_exists(
str(source.pk),
verbose_name=verbose_name.format(source.name),
)
# Recreate the initial indexing task
verbose_name = _('Index media from source "{}"')
index_source_task(

View File

@ -14,6 +14,7 @@ from tempfile import TemporaryDirectory
from urllib.parse import urlsplit, parse_qs
from django.conf import settings
from .choices import Val, FileExtension
from .hooks import postprocessor_hook, progress_hook
from .utils import mkdir_p
import yt_dlp
@ -204,10 +205,14 @@ def get_media_info(url, /, *, days=None, info_json=None):
'paths': paths,
'postprocessors': postprocessors,
'skip_unavailable_fragments': False,
'sleep_interval_requests': 2 * settings.BACKGROUND_TASK_ASYNC_THREADS,
'sleep_interval_requests': 1,
'verbose': True if settings.DEBUG else False,
'writeinfojson': True,
})
if settings.BACKGROUND_TASK_RUN_ASYNC:
opts.update({
'sleep_interval_requests': 2 * settings.BACKGROUND_TASK_ASYNC_THREADS,
})
if start:
log.debug(f'get_media_info: used date range: {opts["daterange"]} for URL: {url}')
response = {}
@ -301,6 +306,15 @@ def download_media(
).options.sponsorblock_mark
pp_opts.sponsorblock_remove.update(sponsor_categories or {})
# Enable audio extraction for audio-only extensions
audio_exts = set(Val(
FileExtension.M4A,
FileExtension.OGG,
))
if extension in audio_exts:
pp_opts.extractaudio = True
pp_opts.nopostoverwrites = False
ytopts = {
'format': media_format,
'merge_output_format': extension,
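For reference, `pp_opts.extractaudio` maps to yt-dlp's `FFmpegExtractAudio` post-processor; a standalone sketch of what gets enabled for the `m4a`/`ogg` targets (the format selector here is illustrative):

```python
# Hypothetical standalone yt-dlp options mirroring the audio-only path.
ydl_opts = {
    'format': 'bestaudio/best',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'nopostoverwrites': False,
    }],
}
```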

View File

@ -62,6 +62,8 @@ else:
DEFAULT_THREADS = 1
BACKGROUND_TASK_ASYNC_THREADS = getenv('TUBESYNC_WORKERS', DEFAULT_THREADS, integer=True)
if BACKGROUND_TASK_ASYNC_THREADS > 1:
BACKGROUND_TASK_RUN_ASYNC = True
MEDIA_ROOT = CONFIG_BASE_DIR / 'media'

View File

@ -7,7 +7,7 @@ CONFIG_BASE_DIR = BASE_DIR
DOWNLOADS_BASE_DIR = BASE_DIR
VERSION = '0.13.7'
VERSION = '0.14.1'
SECRET_KEY = ''
DEBUG = False
ALLOWED_HOSTS = []
@ -212,9 +212,6 @@ if MAX_RUN_TIME < 600:
DOWNLOAD_MEDIA_DELAY = 60 + (MAX_RUN_TIME / 50)
if RENAME_SOURCES or RENAME_ALL_SOURCES:
BACKGROUND_TASK_ASYNC_THREADS += 1
if BACKGROUND_TASK_ASYNC_THREADS > MAX_BACKGROUND_TASK_ASYNC_THREADS:
BACKGROUND_TASK_ASYNC_THREADS = MAX_BACKGROUND_TASK_ASYNC_THREADS