[ie/youtube] Extract upload date timestamp if available #9856
yt_dlp/extractor/youtube.py
Outdated
if not upload_date or (
live_status in ('not_live', None)
not timestamp
and live_status in ('not_live', None)
and 'no-youtube-prefer-utc-upload-date' not in self.get_param('compat_opts', [])
How should this compat opt work now?
upload_date = (
dt.datetime.fromtimestamp(timestamp, dt.timezone.utc).strftime('%Y%m%d') if (timestamp and 'no-youtube-prefer-utc-upload-date' not in self.get_param('compat_opts', [])) else
(
unified_strdate(get_first(microformats, 'uploadDate'))
or unified_strdate(search_meta('uploadDate'))
))
maintaining this compat opt is turning out to be a bit of a pain
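For reference, the selection order in the snippet above can be sketched in isolation. This is a minimal sketch, not the extractor's code: pick_upload_date and its parameters are hypothetical names, and fallback_date stands in for whatever unified_strdate(...) yields from the page metadata.

```python
import datetime as dt

COMPAT_OPT = 'no-youtube-prefer-utc-upload-date'


def pick_upload_date(timestamp, fallback_date, compat_opts=()):
    """Return a YYYYMMDD string (hypothetical helper, for illustration only).

    Prefer the UTC date derived from the timestamp unless the compat
    option is set or no timestamp is available; otherwise use the
    metadata-derived fallback date unchanged.
    """
    if timestamp and COMPAT_OPT not in compat_opts:
        return dt.datetime.fromtimestamp(timestamp, dt.timezone.utc).strftime('%Y%m%d')
    return fallback_date


# 1704070800 is 2024-01-01 01:00:00 UTC, i.e. still 2023-12-31 in Pacific time.
print(pick_upload_date(1704070800, '20231231'))
print(pick_upload_date(1704070800, '20231231', compat_opts=[COMPAT_OPT]))
```

The two calls differ by one day for the same video, which is the behaviour the compat option has to preserve if it is kept.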
I believe youtube-dl is also planning to implement timestamp as UTC. @dirkf Can you confirm? In that case, we should just make the compat option a no-op
Mhm. It also gets messy because we will now be changing upload_date for scheduled, live and past live streams / premieres to UTC, since we can get it accurately.
Yes, my WIP extractor gets the timestamp when possible, currently believed to be always, but I need to go back and check the age-gate API data.
Per #9829 (comment), the meta fields in the video page as well as the hydration JSON are using ISO 8601 with TZ offset. Presumably the API fields will mirror the hydration JSON.
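As an illustration of why an ISO 8601 value with a TZ offset is sufficient, here is a minimal sketch using only the standard library. The function names and the sample string are hypothetical, not taken from the page data or from either extractor.

```python
import datetime as dt


def parse_upload_datetime(value):
    """Parse an ISO 8601 string that carries a UTC offset into a Unix timestamp."""
    parsed = dt.datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError('no UTC offset in %r' % (value,))
    return int(parsed.timestamp())


def utc_date(timestamp):
    """Format a Unix timestamp as a YYYYMMDD date in UTC."""
    return dt.datetime.fromtimestamp(timestamp, dt.timezone.utc).strftime('%Y%m%d')


# Hypothetical sample value: 17:00 on 31 Dec 2023 at UTC-08:00.
ts = parse_upload_datetime('2023-12-31T17:00:00-08:00')
print(ts, utc_date(ts))
```

The offset in the string fixes the instant unambiguously, so the UTC date can be derived without knowing any DST rules.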
The example code that I posted included the idea of own_upload_date to represent the cases where the automatic core upload_date shouldn't be used (because it should be in Pacific time, not UTC) and the unresolved TODO: ... to handle the calculation of upload_date in Pacific time from a known timestamp relative to UTC.
The problem is to know which actual TZ offset from (PST, PDT) applies in Mountain View, CA, for a particular UTC time. Even in Py >= 3.9 with zoneinfo, the library docs say that there's no cross-platform solution to this sort of TZ processing. If that were required, the code would need to include the DST rules for the Pacific TZ for each year where the calculation has to be done.
I think that this really argues for "retiring" the compat option and hoping that all the data sources used by the extractor are now providing ISO 8601 data with TZ.
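To make the portability concern concrete, here is a minimal sketch of the Pacific-time calculation with zoneinfo. It assumes Python >= 3.9 and that IANA time zone data is available (from the system or the tzdata package); pacific_upload_date is a hypothetical helper that returns None when that assumption does not hold.

```python
import datetime as dt

try:
    from zoneinfo import ZoneInfo, ZoneInfoNotFoundError
except ImportError:  # Python < 3.9: zoneinfo is not in the standard library
    ZoneInfo = None


def pacific_upload_date(timestamp):
    """Return the YYYYMMDD date in US Pacific time, or None if it cannot be computed."""
    if ZoneInfo is None:
        return None
    try:
        pacific = ZoneInfo('America/Los_Angeles')
    except ZoneInfoNotFoundError:
        # No time zone database on this platform and no tzdata package installed.
        return None
    return dt.datetime.fromtimestamp(timestamp, pacific).strftime('%Y%m%d')


# 2024-01-01 01:00 UTC (PST applies) and 2024-07-01 07:30 UTC (PDT applies).
print(pacific_upload_date(1704070800))
print(pacific_upload_date(1719819000))
```

The second timestamp falls on 1 July under PDT but would fall on 30 June under a fixed PST offset, which is why a hard-coded UTC-8 is not a safe substitute.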
upload_date = (
dt.datetime.fromtimestamp(timestamp, dt.timezone.utc).strftime('%Y%m%d') if timestamp else
(
unified_strdate(get_first(microformats, 'uploadDate'))
or unified_strdate(search_meta('uploadDate'))
))
If timestamp is available, don't set upload_date at all. Core code will do it.
We still have the old upload_date fallback below. Additionally, it helps us avoid having to rewrite all the code below depending on whether timestamp is available or not.
Why a1af9ff?
To ensure it is accurate in the event they remove the timezone (similar to what they had before). Just ensuring we don't end up with wrong upload dates on videos where possible.
updated working YoutubeIE tests with timestamp
Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
Made --compat-options no-youtube-prefer-utc-upload-date a no-op and updated README to reflect that.
Description of your pull request and other information
resolves #9829
fixes #4962
TODO update youtube tests and more thorough testing