[BUG] - Hashtag search cannot fetch all media data, even when looping #1175

zhangzyg · 2024-07-27T15:17:45Z

I want to search for Ukraine-related videos in the America region, but I can only fetch 30-50 records. When I check on TikTok, there are 7.1M records. Can we download all of them, or is there any way to search by time range?

My code snippet:

async def search_videos_hashtag(hashtag, time_from, time_to, current_video_amount=0,
                                count=100, times=0) -> None:
    global result, api, current_os, result_tik_id_set
    format_style = '%m/%d/%y' if current_os == 'Windows' else '%Y/%m/%d'
    sleep(random.Random().randint(a=3, b=5))
    temp = 0  # videos returned by this call
    temp_video_amount = current_video_amount  # cursor value before this call
    if api is not None:
        if len(api.sessions) == 0:
            await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3, headless=False)  # ms_token is None
        async for searchRes in api.hashtag(hashtag).videos(count=count, cursor=current_video_amount):
            temp += 1
            current_video_amount += 1
            # Extend the upper bound by one day so the whole end date is included.
            time_to_add_one_day = int((
                datetime.fromtimestamp(format_str_timestamp(time_to, format_style)) +
                timedelta(days=1)).timestamp())
            if (format_str_timestamp(time_from, format_style) <= searchRes.as_dict['createTime'] <= time_to_add_one_day
                    and searchRes.id not in result_tik_id_set):
                author = construct_author_metadata(searchRes)
                publish = construct_publish_metadata(searchRes)
                author.append_publish(publish)
                result.append(author)
                result_tik_id_set.add(searchRes.id)
                print('append one tik tok data, current search: ' + str(current_video_amount))
        # Nothing new came back: fall back to related videos of what was collected so far.
        if temp_video_amount == current_video_amount:
            sleep(random.Random().randint(a=3, b=5))
            video_urls = list(map(lambda res: res.publish[0].link, result))
            for url in video_urls:
                await search_related_videos(url, time_from, time_to, required_video_amount=count,
                                            current_video_amount=0,
                                            count=int(count / len(video_urls)))
        # Fewer videos than requested: retry from the new cursor, at most 100 times.
        if temp < count and times < 100:
            await search_videos_hashtag(hashtag, time_from, time_to, current_video_amount,
                                        count, times=times + 1)
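
On the time-range question: the snippet above already filters on `createTime` after fetching, and that check can be pulled out into a small helper. This is only a sketch, with a few assumptions: `createTime` is taken to be Unix seconds, the dates are interpreted as UTC (the snippet above uses local time), and `in_time_range` and `filter_by_time` are hypothetical names, not part of TikTokApi. It narrows the results you already have; it does not make the endpoint return more of them.

```python
from datetime import datetime, timedelta, timezone


def in_time_range(create_time: int, date_from: str, date_to: str,
                  fmt: str = "%Y/%m/%d") -> bool:
    """True if create_time (Unix seconds) lies between date_from and the end of date_to."""
    start = datetime.strptime(date_from, fmt).replace(tzinfo=timezone.utc)
    # Add one day so the whole end date is included (half-open interval).
    end = datetime.strptime(date_to, fmt).replace(tzinfo=timezone.utc) + timedelta(days=1)
    return start.timestamp() <= create_time < end.timestamp()


def filter_by_time(videos: list[dict], date_from: str, date_to: str) -> list[dict]:
    """Keep only the video dicts whose createTime falls inside the date range."""
    return [v for v in videos
            if in_time_range(int(v["createTime"]), date_from, date_to)]
```

With TikTokApi you would presumably pass `video.as_dict` for each fetched video.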

sameerahmedcls · 2024-07-31T21:35:45Z

Can you send your full code?

vagvalas · 2024-08-26T20:09:12Z

I can also confirm that this was already a problem before 6.4 (in 6.3.0), which could not get past 45 videos. With 6.4 and later we can finally fetch a larger number (I reached 340 videos), but the loop goes through the same videos again and again: yt-dlp, which receives each fetched URL, keeps reporting "already downloaded".

Here is my code:

from TikTokApi import TikTokApi
from yt_dlp import YoutubeDL
import asyncio
import os
from TikTokApi.exceptions import EmptyResponseException, TikTokException

ms_token = os.environ.get("multi_sids", "tjDG1O3i59WDpaK2v-spT5hmt1NcSJufT17v7cwvveTTqtYyq0N9mtAU-j76lfb7_msyycgSNt38AJVj2GF_KSxME27wc4C73eCVfSNsBs98TlO4PTOd2CEk7iRCm7kiFy7SPqKhUt33xvJ_LVtU")
ydl_opts = {
    'outtmpl': '%(uploader)s_%(id)s_%(timestamp)s.%(ext)s',
}

async def download_hashtag_videos(hashtag):
    async with TikTokApi() as api:
        try:
            await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3,
                                      headless=False, suppress_resource_load_types=["image", "media", "font", "stylesheet"])

            tag = api.hashtag(name=hashtag)
            more_videos = True
            while more_videos:
                videos = tag.videos(count=5000)
                video_list = []
                
                async for video in videos:
                    video_list.append(video)

                if not video_list:
                    more_videos = False
                    break

                for video in video_list:
                    print(f"Username: {video.author.username}")
                    print(f"Video ID: {video.id}")
                    print(f"Stats: {video.stats}")

                    video_url = f"https://www.tiktok.com/@{video.author.username}/video/{video.id}"
                    try:
                        with YoutubeDL(ydl_opts) as ydl:
                            ydl.download([video_url])
                    except Exception as e:
                        print(f"Error downloading video {video.id}: {e}")

        except EmptyResponseException as e:
            print(f"EmptyResponseException: {e}")
        except TikTokException as e:
            print(f"TikTokException: {e}")
        except Exception as e:
            print(f"An unexpected error occurred: {e}")

if __name__ == "__main__":
    hashtag = 'coldplayathens'
    asyncio.run(download_hashtag_videos(hashtag))

TikTokApi: 6.5.2
Python 3.12
Playwright: 1.39.00
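
One possible reason for the repeats: each pass of the `while` loop above calls `tag.videos(count=5000)` again without a cursor, so every pass appears to start from the beginning. Below is a rough sketch of a loop that advances a cursor, keeps only unseen IDs, and stops once a pass brings nothing new. `collect_unique` and `fetch_page` are hypothetical names; with TikTokApi, `fetch_page` might wrap something like `tag.videos(count=30, cursor=cursor)`, but here a fake generator stands in so the sketch runs on its own. It does not address whatever server-side limit may be capping the results.

```python
import asyncio
from typing import AsyncIterator, Callable


async def collect_unique(fetch_page: Callable[[int], AsyncIterator[dict]],
                         max_passes: int = 100) -> list[dict]:
    """Fetch pages from an advancing cursor and keep each video ID only once."""
    seen: set[str] = set()
    unique: list[dict] = []
    cursor = 0
    for _ in range(max_passes):
        fetched = 0
        new_in_pass = 0
        async for video in fetch_page(cursor):
            fetched += 1
            video_id = str(video["id"])
            if video_id not in seen:
                seen.add(video_id)
                unique.append(video)
                new_in_pass += 1
        cursor += fetched
        # Stop when a pass yields nothing new (empty page or only repeats).
        if new_in_pass == 0:
            break
    return unique


async def fake_page(cursor: int) -> AsyncIterator[dict]:
    """Stand-in for the real API: two items per page, with one repeated ID."""
    data = [{"id": "1"}, {"id": "2"}, {"id": "2"}, {"id": "3"}]
    for item in data[cursor:cursor + 2]:
        yield item
```

Running `asyncio.run(collect_unique(fake_page))` walks the fake data in pages of two and drops the repeated ID.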

vagvalas · 2024-08-26T20:15:21Z

Besides that, it also seems to fetch videos that do not belong to the corresponding hashtag, for example:
https://www.tiktok.com/@tashawishesyouluck/video/7407303973583015201

That video is not even tagged with 'coldplayathens'.
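
A possible client-side workaround is to check each video's own hashtag list before downloading it. The sketch below assumes the raw video dict (e.g. `video.as_dict`) carries hashtags under `challenges[].title` and/or `textExtra[].hashtagName`; those field names are an assumption about the response shape, so verify them against a real response. `has_hashtag` is a hypothetical helper.

```python
def has_hashtag(video: dict, hashtag: str) -> bool:
    """True if the video dict lists the given hashtag (case-insensitive)."""
    wanted = hashtag.lstrip("#").lower()
    # Assumed field names; either list may be missing or None.
    names = [c.get("title", "") for c in video.get("challenges") or []]
    names += [t.get("hashtagName", "") for t in video.get("textExtra") or []]
    return any(name.lower() == wanted for name in names if name)
```

For example, `has_hashtag(video.as_dict, 'coldplayathens')` could gate the download call.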

zhangzyg added the "bug" label on Jul 27, 2024
