Rewritten to use TailwindCSS and HTMX 🔥

ForeverPyrite
2025-05-08 01:35:18 -04:00
parent 14b320bbde
commit b5a2b4e6d1
19 changed files with 590 additions and 571 deletions

@@ -4,8 +4,8 @@ __pycache__
*.pyd
*.env
*venv/
*.git
.git
.gitignore
Dockerfile
docker-compose.yml
log.md
log.md

.vscode/launch.json

@@ -1,26 +1,26 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: Flask",
"type": "debugpy",
"request": "launch",
"cwd": "./app",
"module": "flask",
"env": {
"FLASK_APP": "./app.py",
"FLASK_DEBUG": "1"
},
"args": [
"run",
"--debug",
"--no-reload"
],
"jinja": true,
"autoStartBrowser": false
}
]
}

@@ -1,5 +1,5 @@
{
"html.autoClosingTags": true,
"html.format.enable": true,
"html.autoCreateQuotes": true
}

@@ -1,5 +1,5 @@
# Use an official Python runtime as a parent image
FROM python:3.11-slim
FROM python:3.13-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
@@ -19,7 +19,7 @@ RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
# Copy application files
COPY . /app
COPY /app /app
# Make start.sh executable
RUN chmod +x start.sh

@@ -1,18 +1,18 @@
This is a simple web application made for a very specific purpose: to spite my 10th grade social studies teacher.
See, basically he didn't teach us anything; he wanted us to watch 10+ year old lectures on [his YouTube channel](https://www.youtube.com/@mikebardonaro3227).
Yes, he's been doing this for so long that he can probably monetize his channel.
For each lecture, he also wanted us to take notes and create 5 Questions and Answers with a few criteria.
Now here's what I thought about this:
If he isn't actually going to teach us the content of the class that I have to physically attend, then why should I do anything but match the effort he's putting forth?
So I made a Python script that took a video ID, got the YouTube transcript for it, and then fed it to an OpenAI assistant to do it for me.
This was pretty pointless, as you can also just copy the transcript and paste it into the assistant threads on OpenAI's playground platform, but I still did it for the hell of it.
That got me through the year just fine.
However, the next year I had a friend who also got into his class, and instead of having him repeatedly ask me for my old work, I figured why not let him create some..."original" work himself?
So I spent a few nights developing a web application with a very simple task, and here it is.
It is some pretty bad code. Like, actually "minimum to make it work" code. However, I've decided to use this as an opportunity to still learn some things and hopefully be able to do more dedicated things.
I still occasionally revisit it to try to make it a little better, and I might even scale up the website a bit and make it so that anyone can use it. Of course, this would come at a cost, but I feel like it would be relatively deserved for the teacher after more than a dozen years of not doing anything.
If I ever make this repository public, judge the hell out of me. Just know that, unfortunately for everyone who would be looking for it, I never commit any hardcoded API keys, or `.env`...sorry.

@@ -1,68 +1,90 @@
import logging
import os
from flask import Flask, render_template, Response, request, session
from main import yoink, process, user_streams, stream_lock
import uuid # Import UUID
app = Flask(__name__, static_folder="website/static", template_folder="website")
app.secret_key = os.urandom(24) # Necessary for using sessions
# Configure logging
logging.basicConfig(
filename='./logs/app.log',
level=logging.DEBUG,
format='%(asctime)s %(levelname)s: %(message)s',
datefmt='%Y-%m-%d %H:%M:%S'
)
def create_session():
session_id = str(uuid.uuid4())
# This should never happen but I'm putting the logic there anyways
try:
if user_streams[session_id]:
session_id = create_session()
except KeyError:
pass
return session_id
@app.route('/')
def home():
session_id = create_session()
session['id'] = session_id
logging.info(f"Home page accessed. Assigned initial session ID: {session_id}")
return render_template('index.html', session_id=session_id)
@app.route('/process_url', methods=['POST'])
def process_url():
session_id = session.get('id')
if not session_id:
session_id = create_session()
session['id'] = session_id
logging.info(f"No existing session. Created new session ID: {session_id}")
url = request.form['url']
logging.info(f"Received URL for processing from session {session_id}: {url}")
success, msg, status_code, = process(url, session_id)
if success:
logging.info(f"Processing started successfully for session {session_id}.")
return Response("Processing started. Check /stream_output for updates.", content_type='text/plain', status=200)
else:
logging.error(f"Processing failed for session {session_id}: {msg}")
return Response(msg, content_type='text/plain', status=status_code)
@app.route('/stream_output')
def stream_output():
session_id = session.get('id')
if not session_id or session_id not in user_streams:
logging.warning(f"Stream requested without a valid session ID: {session_id}")
return Response("No active stream for this session.", content_type='text/plain', status=400)
logging.info(f"Streaming output requested for session {session_id}.")
return Response(yoink(session_id), content_type='text/plain', status=200)
if __name__ == '__main__':
logging.info("Starting Flask application.")
app.run(debug=True, threaded=True) # Enable threaded to handle multiple requests
import logging
import os
import uuid

from flask import Flask, render_template, Response, request, session
from main import yoink, process, user_streams, stream_lock

app = Flask(__name__, static_folder="website/static", template_folder="website")
app.secret_key = os.urandom(24)  # Necessary for using sessions

# Configure logging
logging.basicConfig(
    filename='./logs/app.log',
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)


def create_session():
    """
    Create a new session by generating a UUID and ensuring it does not collide
    with an existing session in the user_streams global dictionary.

    Returns:
        str: A unique session ID.
    """
    session_id = str(uuid.uuid4())
    # Even though collisions are unlikely, we check for safety.
    try:
        if user_streams[session_id]:
            session_id = create_session()
    except KeyError:
        pass
    return session_id


@app.route('/')
def home():
    """
    Render the home page and initialize a session.

    Returns:
        Response: The rendered home page with a unique session id.
    """
    session_id = create_session()
    session['id'] = session_id
    logging.info(f"Home page accessed. Assigned initial session ID: {session_id}")
    return render_template('index.html', session_id=session_id)


@app.route('/process_url', methods=['POST'])
def process_url():
    """
    Accept a YouTube URL (from a form submission), initialize the session if necessary,
    and trigger the transcript retrieval and AI processing.

    Returns:
        Response: Text response indicating start or error message.
    """
    session_id = session.get('id')
    if not session_id:
        session_id = create_session()
        session['id'] = session_id
        logging.info(f"No existing session. Created new session ID: {session_id}")
    url = request.form['url']
    logging.info(f"Received URL for processing from session {session_id}: {url}")
    success, msg, status_code = process(url, session_id)
    if success:
        logging.info(f"Processing started successfully for session {session_id}.")
        return Response("Processing started. Check /stream_output for updates.", content_type='text/plain', status=200)
    else:
        logging.error(f"Processing failed for session {session_id}: {msg}")
        return Response(msg, content_type='text/plain', status=status_code)


@app.route('/stream_output')
def stream_output():
    """
    Stream the AI processing output for the current session.

    Returns:
        Response: A streaming response with text/plain content.
    """
    session_id = session.get('id')
    if not session_id or session_id not in user_streams:
        logging.warning(f"Stream requested without a valid session ID: {session_id}")
        return Response("No active stream for this session.", content_type='text/plain', status=400)
    logging.info(f"Streaming output requested for session {session_id}.")
    return Response(yoink(session_id), content_type='text/plain', status=200)


if __name__ == '__main__':
    logging.info("Starting Flask application.")
    # Running with threaded=True to handle multiple requests concurrently.
    app.run(debug=True, threaded=True)
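
The collision check in `create_session` could also be written as a loop instead of a recursive call. The following is only a sketch under assumptions: `new_session_id` and its `active` parameter are hypothetical names that do not appear in this commit, and a plain dict stands in for `user_streams`.

```python
import uuid


def new_session_id(active: dict) -> str:
    """Return a UUID4 string that is not already a key in `active`.

    `active` is a stand-in for the user_streams dict (an assumption of this sketch).
    """
    session_id = str(uuid.uuid4())
    # Collisions are extremely unlikely; the loop only guards the invariant.
    while session_id in active:
        session_id = str(uuid.uuid4())
    return session_id
```

A membership test (`in`) also avoids treating a falsy value stored under an existing key as "no collision", which the truthiness check on `user_streams[session_id]` would do.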

@@ -1,244 +1,331 @@
import re
import threading
import asyncio
from asyncio import sleep
from typing_extensions import override
from datetime import datetime
import pytz
import os
import logging
import uuid
# Youtube Transcript imports
import youtube_transcript_api._errors
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter
# OpenAI API imports
from openai import AssistantEventHandler
from openai import OpenAI
# Load environment variables
from dotenv import load_dotenv
load_dotenv()
# Initialize user stream dictionary
user_streams = {}
# Threading lock for thread safe stuff I think, idk it was used in the docs
stream_lock = threading.Lock()
# Handle async outside of async functions
awaiter = asyncio.run
# Configure logging
try:
logging.basicConfig(
filename='./logs/main.log',
level=logging.INFO,
format='%(asctime)s %(levelname)s: %(message)s',
datefmt='%Y-%m-%d %H:%M:%S'
)
except FileNotFoundError as e:
with open("./logs/main.log", "x"):
pass
logging.basicConfig(
filename='./logs/main.log',
level=logging.INFO,
format='%(asctime)s %(levelname)s: %(message)s',
datefmt='%Y-%m-%d %H:%M:%S'
)
logging.info(f"No main.log file was found ({e}), so one was created.")
# The StreamOutput class to handle streaming
class StreamOutput:
def __init__(self):
self.delta: str = ""
self.response: str = ""
self.done: bool = False
self.buffer: list = []
def reset(self):
self.delta = ""
self.response = ""
self.done = False
self.buffer = []
def send_delta(self, delta):
awaiter(self.process_delta(delta))
async def process_delta(self, delta):
self.delta = delta
self.response += delta
def get_index(lst):
if len(lst) == 0:
return 0
else:
return len(lst) - 1
if self.buffer:
try:
if self.delta != self.buffer[get_index(self.buffer)]:
self.buffer.append(delta)
except IndexError as index_error:
logging.error(f"Caught IndexError: {str(index_error)}")
self.buffer.append(delta)
else:
self.buffer.append(delta)
return
# OpenAI Config
# Setting up OpenAI Client with API Key
client = OpenAI(
organization='org-7ANUFsqOVIXLLNju8Rvmxu3h',
project="proj_NGz8Kux8CSka7DRJucAlDCz6",
api_key=os.getenv("OPENAI_API_KEY")
)
# Screw Bardo Assistant ID
asst_screw_bardo_id = "asst_JGFaX6uOIotqy5mIJnu3Yyp7"
# Event Handler for OpenAI Assistant
class EventHandler(AssistantEventHandler):
def __init__(self, output_stream: StreamOutput):
super().__init__()
self.output_stream = output_stream
@override
def on_text_created(self, text) -> None:
self.output_stream.send_delta("Response Received:\n\nScrew-Bardo:\n\n")
logging.info("Text created event handled.")
@override
def on_text_delta(self, delta, snapshot):
self.output_stream.send_delta(delta.value)
logging.debug(f"Text delta received: {delta.value}")
def on_tool_call_created(self, tool_call):
error_msg = "Assistant shouldn't be calling tools."
logging.error(error_msg)
raise Exception(error_msg)
def create_and_stream(transcript, session_id):
logging.info(f"Starting OpenAI stream thread for session {session_id}.")
event_handler = EventHandler(user_streams[session_id]['output_stream'])
try:
with client.beta.threads.create_and_run_stream(
assistant_id=asst_screw_bardo_id,
thread={
"messages": [{"role": "user", "content": transcript}]
},
event_handler=event_handler
) as stream:
stream.until_done()
with stream_lock:
user_streams[session_id]['output_stream'].done = True
logging.info(f"OpenAI stream completed for session {session_id}.")
except Exception as e:
logging.exception(f"Exception occurred during create_and_stream for session {session_id}.")
def yoink(session_id):
logging.info(f"Starting stream for session {session_id}...")
with stream_lock:
user_data = user_streams.get(session_id)
if not user_data:
logging.critical(f"User data not found for session id {session_id}?")
return # Session might have ended
output_stream: StreamOutput = user_data.get('output_stream')
thread: threading.Thread = user_data.get('thread')
thread.start()
while True:
if not output_stream or not thread:
logging.error(f"No output stream/thread for session {session_id}.\nThread: {thread.name if thread else "None"}")
break
if output_stream.done and not output_stream.buffer:
break
try:
if output_stream.buffer:
delta = output_stream.buffer.pop(0)
yield bytes(delta, encoding="utf-8")
else:
asyncio.run(sleep(0.018))
except Exception as e:
logging.exception(f"Exception occurred during streaming for session {session_id}: {e}")
break
logging.info(f"Stream completed successfully for session {session_id}.")
logging.info(f"Completed Assistant Response for session {session_id}:\n{output_stream.response}")
with stream_lock:
thread.join()
del user_streams[session_id]
logging.info(f"Stream thread joined and resources cleaned up for session {session_id}.")
def process(url, session_id):
# Should initialize the key in the dictionary
current_time = datetime.now(pytz.timezone('America/New_York')).strftime('%Y-%m-%d %H:%M:%S')
logging.info(f"New Entry at {current_time} for session {session_id}")
logging.info(f"URL: {url}")
video_id = get_video_id(url)
if not video_id:
logging.warning(f"Could not parse video id from URL: {url}")
return (False, "Couldn't parse video ID from URL. (Are you sure you entered a valid YouTube.com or YouTu.be URL?)", 400)
logging.info(f"Parsed Video ID: {video_id}")
# Get the transcript for that video ID
transcript = get_auto_transcript(video_id)
if not transcript:
logging.error(f"Error: could not retrieve transcript for session {session_id}. Assistant won't be called.")
return (False, "Successfully parsed video ID from URL, however the ID was either invalid, the transcript was disabled by the video owner, or some other error was raised because of YouTube.", 200)
user_streams[session_id] = {
'output_stream': None, # Ensure output_stream is per user
'thread': None
}
# Create a new StreamOutput for the session
with stream_lock:
user_streams[session_id]['output_stream'] = StreamOutput()
thread = threading.Thread(
name=f"create_stream_{session_id}",
target=create_and_stream,
args=(transcript, session_id)
)
user_streams[session_id]['thread'] = thread
logging.info(f"Stream preparation complete for session {session_id}, sending reply.")
return (True, None, None)
def get_video_id(url):
youtu_be = r'(?<=youtu.be/)([A-Za-z0-9_-]{11})'
youtube_com = r'(?<=youtube\.com\/watch\?v=)([A-Za-z0-9_-]{11})'
id_match = re.search(youtu_be, url)
if not id_match:
id_match = re.search(youtube_com, url)
if not id_match:
# Couldn't parse video ID from URL
logging.warning(f"Failed to parse video ID from URL: {url}")
return None
return id_match.group(1)
def get_auto_transcript(video_id):
trans_api_errors = youtube_transcript_api._errors
try:
transcript = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'], proxies=None, cookies=None, preserve_formatting=False)
except trans_api_errors.TranscriptsDisabled as e:
logging.exception(f"Exception while fetching transcript: {e}")
return None
formatter = TextFormatter() # Ensure that you create an instance of TextFormatter
txt_transcript = formatter.format_transcript(transcript)
logging.info("Transcript successfully retrieved and formatted.")
return txt_transcript
# Initialize output stream
output_stream = StreamOutput()
logging.info(f"Main initialized at {datetime.now(pytz.timezone('America/New_York')).strftime('%Y-%m-%d %H:%M:%S')}. Presumably application starting.")
"""
Main module that handles processing of YouTube transcripts and connecting to the AI service.
Each user session has its own output stream and thread to handle the asynchronous AI response.
"""
import re
import threading
import asyncio
from asyncio import sleep
from datetime import datetime
import pytz
import os
import logging
import uuid

# Youtube Transcript imports
import youtube_transcript_api._errors
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter

# OpenAI API imports
from openai import AssistantEventHandler
from openai import OpenAI

from dotenv import load_dotenv

load_dotenv()

# Global dict for per-user session streams.
user_streams = {}

# Lock to ensure thread-safe operations on shared memory.
stream_lock = threading.Lock()

# For running async code in non-async functions.
awaiter = asyncio.run

# Configure logging
try:
    logging.basicConfig(
        filename='./logs/main.log',
        level=logging.INFO,
        format='%(asctime)s %(levelname)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
except FileNotFoundError as e:
    with open("./logs/main.log", "x"):
        pass
    logging.basicConfig(
        filename='./logs/main.log',
        level=logging.INFO,
        format='%(asctime)s %(levelname)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    logging.info(f"No main.log file was found ({e}), so one was created.")
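
The fallback above only helps when the `logs` directory already exists, because `open(..., "x")` cannot create a file inside a missing directory either. A minimal sketch of one alternative is to create the directory before configuring logging. `configure_file_logging` and `demo` are hypothetical helpers, not part of this commit, and `force=True` (Python 3.8+) is an assumption about the desired behaviour: it replaces any handlers installed by an earlier `basicConfig` call.

```python
import logging
import os
import tempfile


def configure_file_logging(log_path: str, level: int = logging.INFO) -> None:
    """Create the log directory if needed, then log to `log_path`."""
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    logging.basicConfig(
        filename=log_path,
        level=level,
        format='%(asctime)s %(levelname)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S',
        force=True,  # drop handlers left over from a previous basicConfig call
    )


def demo() -> str:
    """Log one line into a fresh temporary directory and return the file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "logs", "main.log")
        configure_file_logging(log_path)
        logging.info("logger ready")
        # Close and detach the handlers so the file is flushed before reading.
        root = logging.getLogger()
        for handler in list(root.handlers):
            handler.close()
            root.removeHandler(handler)
        with open(log_path, encoding="utf-8") as handle:
            return handle.read()
```

Without `force=True`, a second `basicConfig` call is a no-op once the root logger has handlers, which is worth keeping in mind when both `app.py` and `main.py` configure logging.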
class StreamOutput:
    """
    Class to encapsulate a session's streaming output.

    Attributes:
        delta (str): Last delta update.
        response (str): Cumulative response from the AI.
        done (bool): Flag indicating if streaming is complete.
        buffer (list): List of output delta strings pending streaming.
    """

    def __init__(self):
        self.delta: str = ""
        self.response: str = ""
        self.done: bool = False
        self.buffer: list = []

    def reset(self):
        """
        Reset the stream output to its initial state.
        """
        self.delta = ""
        self.response = ""
        self.done = False
        self.buffer = []

    def send_delta(self, delta):
        """
        Process a new delta string. This method is a synchronous wrapper that calls the async
        method process_delta.

        Args:
            delta (str): The delta string to process.
        """
        awaiter(self.process_delta(delta))

    async def process_delta(self, delta):
        """
        Process a new delta chunk asynchronously to update buffering.

        Args:
            delta (str): The delta portion of the response.
        """
        self.delta = delta
        self.response += delta

        def get_index(lst):
            return 0 if not lst else len(lst) - 1

        if self.buffer:
            try:
                if self.delta != self.buffer[get_index(self.buffer)]:
                    self.buffer.append(delta)
            except IndexError as index_error:
                logging.error(f"Caught IndexError: {str(index_error)}")
                self.buffer.append(delta)
        else:
            self.buffer.append(delta)
        return
# OpenAI Client configuration
client = OpenAI(
    organization='org-7ANUFsqOVIXLLNju8Rvmxu3h',
    project="proj_NGz8Kux8CSka7DRJucAlDCz6",
    api_key=os.getenv("OPENAI_API_KEY")
)
asst_screw_bardo_id = "asst_JGFaX6uOIotqy5mIJnu3Yyp7"  # Assistant ID for processing


class EventHandler(AssistantEventHandler):
    """
    Event handler for processing OpenAI assistant events.

    Attributes:
        output_stream (StreamOutput): The output stream to write updates to.
    """

    def __init__(self, output_stream: StreamOutput):
        """
        Initialize the event handler with a specific output stream.

        Args:
            output_stream (StreamOutput): The session specific stream output instance.
        """
        super().__init__()
        self.output_stream = output_stream

    def on_text_created(self, text) -> None:
        """
        Event triggered when text is first created.

        Args:
            text (str): The initial response text.
        """
        self.output_stream.send_delta("Response Received:\n\nScrew-Bardo:\n\n")
        logging.info("Text created event handled.")

    def on_text_delta(self, delta, snapshot):
        """
        Event triggered when a new text delta is available.

        Args:
            delta (Any): Object that contains the new delta information.
            snapshot (Any): A snapshot of the current output (if applicable).
        """
        self.output_stream.send_delta(delta.value)
        logging.debug(f"Text delta received: {delta.value}")

    def on_tool_call_created(self, tool_call):
        """
        Handle the case when the assistant attempts to call a tool.
        Raises an exception as this behavior is unexpected.

        Args:
            tool_call (Any): The tool call info.

        Raises:
            Exception: Always, since tool calls are not allowed.
        """
        error_msg = "Assistant shouldn't be calling tools."
        logging.error(error_msg)
        raise Exception(error_msg)
def create_and_stream(transcript, session_id):
    """
    Create a new thread that runs the OpenAI stream for a given session and transcript.

    Args:
        transcript (str): The transcript from the YouTube video.
        session_id (str): The unique session identifier.
    """
    logging.info(f"Starting OpenAI stream thread for session {session_id}.")
    event_handler = EventHandler(user_streams[session_id]['output_stream'])
    try:
        with client.beta.threads.create_and_run_stream(
            assistant_id=asst_screw_bardo_id,
            thread={
                "messages": [{"role": "user", "content": transcript}]
            },
            event_handler=event_handler
        ) as stream:
            stream.until_done()
        with stream_lock:
            user_streams[session_id]['output_stream'].done = True
        logging.info(f"OpenAI stream completed for session {session_id}.")
    except Exception as e:
        logging.exception(f"Exception occurred during create_and_stream for session {session_id}.")
def yoink(session_id):
    """
    Generator that yields streaming output for a session.
    This function starts the AI response thread, then continuously yields data from the session's output buffer
    until the response is marked as done.

    Args:
        session_id (str): The unique session identifier.

    Yields:
        bytes: Chunks of the AI generated response.
    """
    logging.info(f"Starting stream for session {session_id}...")
    with stream_lock:
        user_data = user_streams.get(session_id)
        if not user_data:
            logging.critical(f"User data not found for session id {session_id}?")
            return
        output_stream: StreamOutput = user_data.get('output_stream')
        thread: threading.Thread = user_data.get('thread')
        thread.start()
    while True:
        if not output_stream or not thread:
            logging.error(f"No output stream/thread for session {session_id}.")
            break
        # Stop streaming when done and there is no pending buffered output.
        if output_stream.done and not output_stream.buffer:
            break
        try:
            if output_stream.buffer:
                delta = output_stream.buffer.pop(0)
                yield bytes(delta, encoding="utf-8")
            else:
                # A short sleep before looping again
                asyncio.run(sleep(0.018))
        except Exception as e:
            logging.exception(f"Exception occurred during streaming for session {session_id}: {e}")
            break
    logging.info(f"Stream completed successfully for session {session_id}.")
    logging.info(f"Completed Assistant Response for session {session_id}:\n{output_stream.response}")
    with stream_lock:
        thread.join()
        # Clean up the session data once done.
        del user_streams[session_id]
        logging.info(f"Stream thread joined and resources cleaned up for session {session_id}.")
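
`yoink` polls a list and sleeps for 18 ms whenever the buffer is empty. One possible alternative, sketched here under assumptions, is a blocking `queue.Queue` with a sentinel that marks the end of the stream. Nothing below is part of this commit: `produce`, `stream_chunks`, `demo` and `_DONE` are hypothetical names, and `produce` merely stands in for the OpenAI event handler.

```python
import queue
import threading

_DONE = object()  # sentinel object marking the end of the stream


def produce(chunks, out: queue.Queue) -> None:
    """Stand-in for the event handler: push each delta, then the sentinel."""
    for chunk in chunks:
        out.put(chunk)
    out.put(_DONE)


def stream_chunks(out: queue.Queue, worker: threading.Thread):
    """Start the worker and yield UTF-8 encoded chunks until the sentinel arrives."""
    worker.start()
    while True:
        item = out.get()  # blocks until the producer supplies an item
        if item is _DONE:
            break
        yield item.encode("utf-8")
    worker.join()


def demo() -> bytes:
    """Run a producer thread and collect everything the generator yields."""
    out = queue.Queue()
    worker = threading.Thread(target=produce, args=(["Hello, ", "world"], out))
    return b"".join(stream_chunks(out, worker))
```

Two caveats apply to this sketch: it does not skip consecutive duplicate deltas the way `process_delta` does, and a producer that fails before sending the sentinel would leave the consumer blocked, so a real version would need to send the sentinel from a `finally` block.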
def process(url, session_id):
    """
    Process a YouTube URL: parse the video id, retrieve its transcript, and prepare the session for AI processing.

    Args:
        url (str): The YouTube URL provided by the user.
        session_id (str): The unique session identifier.

    Returns:
        tuple: (success (bool), message (str or None), status_code (int or None))
    """
    current_time = datetime.now(pytz.timezone('America/New_York')).strftime('%Y-%m-%d %H:%M:%S')
    logging.info(f"New Entry at {current_time} for session {session_id}")
    logging.info(f"URL: {url}")
    video_id = get_video_id(url)
    if not video_id:
        logging.warning(f"Could not parse video id from URL: {url}")
        return (False, "Couldn't parse video ID from URL. (Are you sure you entered a valid YouTube.com or YouTu.be URL?)", 400)
    logging.info(f"Parsed Video ID: {video_id}")
    transcript = get_auto_transcript(video_id)
    if not transcript:
        logging.error(f"Error: could not retrieve transcript for session {session_id}. Assistant won't be called.")
        return (False, "Successfully parsed video ID from URL, however the transcript was disabled by the video owner or invalid.", 200)
    # Initialize session data for streaming.
    user_streams[session_id] = {
        'output_stream': None,
        'thread': None
    }
    with stream_lock:
        user_streams[session_id]['output_stream'] = StreamOutput()
        thread = threading.Thread(
            name=f"create_stream_{session_id}",
            target=create_and_stream,
            args=(transcript, session_id)
        )
        user_streams[session_id]['thread'] = thread
    logging.info(f"Stream preparation complete for session {session_id}, sending reply.")
    return (True, None, None)
def get_video_id(url):
    """
    Extract the YouTube video ID from a URL.

    Args:
        url (str): The YouTube URL.

    Returns:
        str or None: The video ID if found, otherwise None.
    """
    # The dot is escaped so that it only matches a literal "." in the hostname.
    youtu_be = r'(?<=youtu\.be/)([A-Za-z0-9_-]{11})'
    youtube_com = r'(?<=youtube\.com\/watch\?v=)([A-Za-z0-9_-]{11})'
    id_match = re.search(youtu_be, url)
    if not id_match:
        id_match = re.search(youtube_com, url)
    if not id_match:
        logging.warning(f"Failed to parse video ID from URL: {url}")
        return None
    return id_match.group(1)
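
The regular expressions in `get_video_id` only match when `v=` is the first query parameter. As a sketch of another approach, the URL can be parsed with the standard library first. `parse_video_id` is a hypothetical helper, not part of this commit, and the accepted host list is an assumption. Unlike the regex version, it rejects candidates that are not exactly 11 characters long.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
# Assumed set of hostnames; extend as needed.
_WATCH_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def parse_video_id(url: str) -> Optional[str]:
    """Return the 11-character video ID from a YouTube URL, or None."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        candidate = parts.path.lstrip("/")
    elif host in _WATCH_HOSTS and parts.path == "/watch":
        # Watch links carry the ID in the "v" query parameter, in any position.
        candidate = parse_qs(parts.query).get("v", [""])[0]
    else:
        return None
    return candidate if _ID_RE.fullmatch(candidate) else None
```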
def get_auto_transcript(video_id):
    """
    Retrieve and format the transcript from a YouTube video.

    Args:
        video_id (str): The YouTube video identifier.

    Returns:
        str or None: The formatted transcript if successful; otherwise None.
    """
    trans_api_errors = youtube_transcript_api._errors
    try:
        transcript = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'], proxies=None, cookies=None, preserve_formatting=False)
    except trans_api_errors.TranscriptsDisabled as e:
        logging.exception(f"Exception while fetching transcript: {e}")
        return None
    formatter = TextFormatter()
    txt_transcript = formatter.format_transcript(transcript)
    logging.info("Transcript successfully retrieved and formatted.")
    return txt_transcript


# Initialize a global output_stream just for main module logging (not used for per-session streaming).
output_stream = StreamOutput()
logging.info(f"Main initialized at {datetime.now(pytz.timezone('America/New_York')).strftime('%Y-%m-%d %H:%M:%S')}. Application starting.")

app/start.sh

@@ -0,0 +1,2 @@
#!/bin/bash
exec gunicorn -b 0.0.0.0:1986 -w 4 --thread 2 --log-level debug app:app --timeout 120 --worker-class gthread --access-logfile - --error-logfile - --capture-output

@@ -7,19 +7,23 @@
<title>Screw You Bardo</title>
<link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
<link rel="icon" type="image/x-icon" href="https://www.foreverpyrite.com/favicon.ico">
<script src="https://unpkg.com/htmx.org@2.0.4" integrity="sha384-HGfztofotfshcF7+8n44JQL2oJmowVChPTg48S+jvZoztPfvwD79OC/LTtG6dMp+" crossorigin="anonymous"></script>
<script defer src="{{ url_for('static', filename='script.js') }}"></script>
</head>
<body>
<main class="container">
<section id="response-section">
<pre id="response-area">Response will appear here.</pre>
<body class="font-sans flex justify-center items-center h-screen bg-[#1F1F1F] text-white">
<main class="flex flex-col w-11/12 h-[90vh] bg-[#2E2E2E] rounded-lg shadow-lg overflow-hidden">
<section id="response-section" class="flex-1 p-5 bg-[#1E1E1E] overflow-y-auto text-base leading-relaxed scroll-smooth">
<pre id="response-area" class="whitespace-pre-wrap">Response will appear here.</pre>
</section>
<section class="form-section">
<form id="url-form">
<input type="url" id="url_box" name="url" placeholder="Paste the lecture URL here." required autofocus>
<button type="submit" id="submit">Submit</button>
<section class="py-4 px-5 bg-[#3A3A3A]">
<form id="url-form" hx-post="/process_url" hx-swap="none" class="flex gap-2">
<input id="url_box" type="url" name="url" placeholder="Paste the lecture URL here." required autofocus
class="flex-1 py-2 px-3 bg-[#4A4A4A] text-white text-base rounded-md focus:outline-none placeholder-[#B0B0B0]">
<button type="submit" id="submit" class="py-2 px-5 bg-[#5A5A5A] text-white text-base rounded-md hover:bg-[#7A7A7A] disabled:bg-[#3A3A3A] disabled:cursor-not-allowed">
Submit
</button>
</form>
</section>
</main>

@@ -1,88 +1,71 @@
document.addEventListener("DOMContentLoaded", () => {
const responseArea = document.getElementById('response-area');
const responseSection = document.getElementById('response-section');
const submitButton = document.getElementById('submit');
const urlForm = document.getElementById('url-form');
const urlBox = document.getElementById('url_box');
urlForm.addEventListener('submit', function(event) {
event.preventDefault(); // Prevent form from submitting the traditional way
const url = urlBox.value.trim();
if (!url) {
responseArea.innerText = 'Please enter a URL.';
return;
}
// Clear the input and update UI
urlBox.value = "";
submitButton.disabled = true;
responseArea.innerText = 'Processing...';
// Process the URL
fetch('/process_url', {
method: 'POST',
headers: {
'Content-Type': 'application/x-www-form-urlencoded',
},
body: new URLSearchParams({ url: url })
})
.then(response => {
if (!response.ok) {
throw new Error('Network response was not ok');
}
return response.text();
})
.then(text => {
if (text === "Processing started. Check /stream_output for updates.") {
streamOutput(responseArea);
} else {
responseArea.innerText = text;
submitButton.disabled = false;
}
})
.catch(error => {
console.error('Error processing URL:', error);
responseArea.innerText = 'Error processing URL: ' + error.message;
submitButton.disabled = false;
});
});
function streamOutput(responseArea) {
// Fetch the streaming output
fetch('/stream_output')
.then(response => {
if (!response.ok) {
throw new Error('Network response was not ok');
}
const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
responseArea.innerHTML = "";
function readStream() {
reader.read().then(({ done, value }) => {
if (done) {
submitButton.disabled = false;
return;
}
const chunk = decoder.decode(value, { stream: true });
responseArea.innerHTML += chunk;
responseSection.scrollTop = responseSection.scrollHeight
readStream();
}).catch(error => {
console.error('Error reading stream:', error);
responseArea.innerText = 'Error reading stream: ' + error.message;
submitButton.disabled = false;
});
}
readStream();
})
.catch(error => {
console.error('Error fetching stream:', error);
responseArea.innerText = 'Error fetching stream: ' + error.message;
submitButton.disabled = false;
});
}
});
document.addEventListener("DOMContentLoaded", () => {
const responseArea = document.getElementById('response-area');
const responseSection = document.getElementById('response-section');
const submitButton = document.getElementById('submit');
const urlBox = document.getElementById('url_box');
// Before sending HTMX request, prepare UI and handle empty input
document.body.addEventListener('htmx:beforeRequest', function(evt) {
if (evt.detail.elt.id === 'url-form') {
const url = urlBox.value.trim();
if (!url) {
evt.preventDefault(); // htmx skips the request when htmx:beforeRequest is default-prevented
responseArea.innerText = 'Please enter a URL.';
return;
}
urlBox.value = '';
submitButton.disabled = true;
responseArea.innerText = 'Processing...';
}
});
document.body.addEventListener('htmx:afterRequest', function(evt) {
if (evt.detail.elt.id === 'url-form') {
const text = evt.detail.xhr.responseText.trim();
if (text === "Processing started. Check /stream_output for updates.") {
streamOutput(responseArea, responseSection, submitButton);
} else {
responseArea.innerText = text;
submitButton.disabled = false;
}
}
});
function streamOutput(responseArea, responseSection, submitButton) {
// Fetch the streaming output
fetch('/stream_output')
.then(response => {
if (!response.ok) {
throw new Error('Network response was not ok');
}
const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
responseArea.innerHTML = "";
function readStream() {
reader.read().then(({ done, value }) => {
if (done) {
submitButton.disabled = false;
return;
}
const chunk = decoder.decode(value, { stream: true });
responseArea.innerHTML += chunk;
responseSection.scrollTop = responseSection.scrollHeight;
readStream();
}).catch(error => {
console.error('Error reading stream:', error);
responseArea.innerText = 'Error reading stream: ' + error.message;
submitButton.disabled = false;
});
}
readStream();
})
.catch(error => {
console.error('Error fetching stream:', error);
responseArea.innerText = 'Error fetching stream: ' + error.message;
submitButton.disabled = false;
});
}
});
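
Review note on `readStream()` above: the `{ stream: true }` option passed to `decoder.decode()` is what keeps multi-byte UTF-8 characters intact when the server's output is split at an arbitrary byte boundary. The sketch below illustrates that behaviour in isolation. `decodeChunks` and the sample chunks are hypothetical names made up for this note, not part of the commit, and the final flush call is an addition the commit's code does not perform.

```javascript
// Hypothetical helper (not in the commit): decode byte chunks the way
// readStream() does, one decoder instance shared across all chunks.
function decodeChunks(chunks) {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    // stream: true buffers an incomplete multi-byte sequence until the
    // next chunk arrives instead of emitting a replacement character.
    text += decoder.decode(chunk, { stream: true });
  }
  // Assumption: flush whatever is still buffered once the stream ends.
  text += decoder.decode();
  return text;
}

// "HTMX " is 5 bytes and the emoji is 4 bytes, so cutting after byte 6
// splits the emoji across the two chunks.
const sampleBytes = new TextEncoder().encode("HTMX 🔥");
const sampleChunks = [sampleBytes.slice(0, 6), sampleBytes.slice(6)];
const decoded = decodeChunks(sampleChunks);
console.log(decoded === "HTMX 🔥");
```

Decoding each chunk with a fresh `TextDecoder`, or without `stream: true`, would turn the split emoji into replacement characters, which is why the single shared decoder in `streamOutput` matters.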


@@ -1,109 +1,3 @@
@font-face {
font-family: 'NimbusSansD';
src: url('font-files/nimbus-sans-d-ot-light.woff2') format('woff2'),
url('font-files/nimbus-sans-d-ot-light.woff') format('woff');
font-weight: normal;
font-style: normal;
}
@font-face{font-display:swap;font-family:NimbusSansD;font-style:normal;font-weight:400;src:url(/static/font/nimbus-sans-d-ot-light.woff2) format("woff2"),url(/static/font/nimbus-sans-d-ot-light.woff) format("woff")}*,:after,:before{--tw-border-spacing-x:0;--tw-border-spacing-y:0;--tw-translate-x:0;--tw-translate-y:0;--tw-rotate:0;--tw-skew-x:0;--tw-skew-y:0;--tw-scale-x:1;--tw-scale-y:1;--tw-pan-x: ;--tw-pan-y: ;--tw-pinch-zoom: ;--tw-scroll-snap-strictness:proximity;--tw-gradient-from-position: ;--tw-gradient-via-position: ;--tw-gradient-to-position: ;--tw-ordinal: ;--tw-slashed-zero: ;--tw-numeric-figure: ;--tw-numeric-spacing: ;--tw-numeric-fraction: ;--tw-ring-inset: ;--tw-ring-offset-width:0px;--tw-ring-offset-color:#fff;--tw-ring-color:rgba(59,130,246,.5);--tw-ring-offset-shadow:0 0 #0000;--tw-ring-shadow:0 0 #0000;--tw-shadow:0 0 #0000;--tw-shadow-colored:0 0 #0000;--tw-blur: ;--tw-brightness: ;--tw-contrast: ;--tw-grayscale: ;--tw-hue-rotate: ;--tw-invert: ;--tw-saturate: ;--tw-sepia: ;--tw-drop-shadow: ;--tw-backdrop-blur: ;--tw-backdrop-brightness: ;--tw-backdrop-contrast: ;--tw-backdrop-grayscale: ;--tw-backdrop-hue-rotate: ;--tw-backdrop-invert: ;--tw-backdrop-opacity: ;--tw-backdrop-saturate: ;--tw-backdrop-sepia: ;--tw-contain-size: ;--tw-contain-layout: ;--tw-contain-paint: ;--tw-contain-style: }::backdrop{--tw-border-spacing-x:0;--tw-border-spacing-y:0;--tw-translate-x:0;--tw-translate-y:0;--tw-rotate:0;--tw-skew-x:0;--tw-skew-y:0;--tw-scale-x:1;--tw-scale-y:1;--tw-pan-x: ;--tw-pan-y: ;--tw-pinch-zoom: ;--tw-scroll-snap-strictness:proximity;--tw-gradient-from-position: ;--tw-gradient-via-position: ;--tw-gradient-to-position: ;--tw-ordinal: ;--tw-slashed-zero: ;--tw-numeric-figure: ;--tw-numeric-spacing: ;--tw-numeric-fraction: ;--tw-ring-inset: ;--tw-ring-offset-width:0px;--tw-ring-offset-color:#fff;--tw-ring-color:rgba(59,130,246,.5);--tw-ring-offset-shadow:0 0 #0000;--tw-ring-shadow:0 0 #0000;--tw-shadow:0 0 #0000;--tw-shadow-colored:0 0 
#0000;--tw-blur: ;--tw-brightness: ;--tw-contrast: ;--tw-grayscale: ;--tw-hue-rotate: ;--tw-invert: ;--tw-saturate: ;--tw-sepia: ;--tw-drop-shadow: ;--tw-backdrop-blur: ;--tw-backdrop-brightness: ;--tw-backdrop-contrast: ;--tw-backdrop-grayscale: ;--tw-backdrop-hue-rotate: ;--tw-backdrop-invert: ;--tw-backdrop-opacity: ;--tw-backdrop-saturate: ;--tw-backdrop-sepia: ;--tw-contain-size: ;--tw-contain-layout: ;--tw-contain-paint: ;--tw-contain-style: }
* {
box-sizing: border-box;
margin: 0;
padding: 0;
font-family: 'NimbusSansD', sans-serif;
color: #FFFFFF;
}
body {
display: flex;
justify-content: center;
align-items: center;
height: 100vh;
background-color: #1F1F1F;
}
.container {
display: flex;
flex-direction: column;
width: 85vw;
height: 90vh;
background-color: #2E2E2E;
border-radius: 10px;
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2);
overflow: hidden;
}
#response-section {
flex: 1;
padding: 20px;
background-color: #1E1E1E;
overflow-y: auto;
font-size: 1rem;
line-height: 1.5;
scroll-behavior: smooth;
}
.form-section {
padding: 15px 20px;
background-color: #3A3A3A;
}
#response-area {
white-space: pre-wrap;
}
#url-form {
display: flex;
gap: 10px;
}
#url_box {
flex: 1;
padding: 10px 15px;
border: none;
border-radius: 5px;
background-color: #4A4A4A;
color: #FFFFFF;
font-size: 1rem;
outline: none;
}
#url_box::placeholder {
color: #B0B0B0;
}
#submit {
padding: 10px 20px;
border: none;
border-radius: 5px;
background-color: #5A5A5A;
color: #FFFFFF;
font-size: 1rem;
cursor: pointer;
transition: background-color 0.3s ease;
}
#submit:hover {
background-color: #7A7A7A;
}
#submit:disabled {
background-color: #3A3A3A;
cursor: not-allowed;
}
/* Responsive Adjustments */
@media (max-width: 600px) {
.container {
height: 95vh;
}
#url_box {
font-size: 0.9rem;
}
#submit {
font-size: 0.9rem;
padding: 10px;
}
}
/*! tailwindcss v3.4.15 | MIT License | https://tailwindcss.com*/*,:after,:before{border:0 solid #e5e7eb;box-sizing:border-box}:after,:before{--tw-content:""}:host,html{line-height:1.5;-webkit-text-size-adjust:100%;font-family:NimbusSansD,sans-serif;font-feature-settings:normal;font-variation-settings:normal;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-tap-highlight-color:transparent}body{line-height:inherit;margin:0}hr{border-top-width:1px;color:inherit;height:0}abbr:where([title]){-webkit-text-decoration:underline dotted;text-decoration:underline dotted}h1,h2,h3,h4,h5,h6{font-size:inherit;font-weight:inherit}a{color:inherit;text-decoration:inherit}b,strong{font-weight:bolder}code,kbd,pre,samp{font-family:ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace;font-feature-settings:normal;font-size:1em;font-variation-settings:normal}small{font-size:80%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline}sub{bottom:-.25em}sup{top:-.5em}table{border-collapse:collapse;border-color:inherit;text-indent:0}button,input,optgroup,select,textarea{color:inherit;font-family:inherit;font-feature-settings:inherit;font-size:100%;font-variation-settings:inherit;font-weight:inherit;letter-spacing:inherit;line-height:inherit;margin:0;padding:0}button,select{text-transform:none}button,input:where([type=button]),input:where([type=reset]),input:where([type=submit]){-webkit-appearance:button;background-color:transparent;background-image:none}:-moz-focusring{outline:auto}:-moz-ui-invalid{box-shadow:none}progress{vertical-align:baseline}::-webkit-inner-spin-button,::-webkit-outer-spin-button{height:auto}[type=search]{-webkit-appearance:textfield;outline-offset:-2px}::-webkit-search-decoration{-webkit-appearance:none}::-webkit-file-upload-button{-webkit-appearance:button;font:inherit}summary{display:list-item}blockquote,dd,dl,figure,h1,h2,h3,h4,h5,h6,hr,p,pre{margin:0}fieldset{margin:0}fieldset,legend{padding:0}menu,ol,ul{list-s
tyle:none;margin:0;padding:0}dialog{padding:0}textarea{resize:vertical}input::-moz-placeholder,textarea::-moz-placeholder{color:#9ca3af;opacity:1}input::placeholder,textarea::placeholder{color:#9ca3af;opacity:1}[role=button],button{cursor:pointer}:disabled{cursor:default}audio,canvas,embed,iframe,img,object,svg,video{display:block;vertical-align:middle}img,video{height:auto;max-width:100%}[hidden]:where(:not([hidden=until-found])){display:none}.static{position:static}.flex{display:flex}.h-\[90vh\]{height:90vh}.h-screen{height:100vh}.w-11\/12{width:91.666667%}.flex-1{flex:1 1 0%}.flex-col{flex-direction:column}.items-center{align-items:center}.justify-center{justify-content:center}.gap-2{gap:.5rem}.overflow-hidden{overflow:hidden}.overflow-y-auto{overflow-y:auto}.scroll-smooth{scroll-behavior:smooth}.whitespace-pre-wrap{white-space:pre-wrap}.rounded-lg{border-radius:.5rem}.rounded-md{border-radius:.375rem}.bg-\[\#1E1E1E\]{--tw-bg-opacity:1;background-color:rgb(30 30 30/var(--tw-bg-opacity,1))}.bg-\[\#1F1F1F\]{--tw-bg-opacity:1;background-color:rgb(31 31 31/var(--tw-bg-opacity,1))}.bg-\[\#2E2E2E\]{--tw-bg-opacity:1;background-color:rgb(46 46 46/var(--tw-bg-opacity,1))}.bg-\[\#3A3A3A\]{--tw-bg-opacity:1;background-color:rgb(58 58 58/var(--tw-bg-opacity,1))}.bg-\[\#4A4A4A\]{--tw-bg-opacity:1;background-color:rgb(74 74 74/var(--tw-bg-opacity,1))}.bg-\[\#5A5A5A\]{--tw-bg-opacity:1;background-color:rgb(90 90 90/var(--tw-bg-opacity,1))}.p-5{padding:1.25rem}.px-3{padding-left:.75rem;padding-right:.75rem}.px-5{padding-left:1.25rem;padding-right:1.25rem}.py-2{padding-bottom:.5rem;padding-top:.5rem}.py-4{padding-bottom:1rem;padding-top:1rem}.font-sans{font-family:NimbusSansD,sans-serif}.text-base{font-size:1rem;line-height:1.5rem}.leading-relaxed{line-height:1.625}.text-white{--tw-text-opacity:1;color:rgb(255 255 255/var(--tw-text-opacity,1))}.placeholder-\[\#B0B0B0\]::-moz-placeholder{--tw-placeholder-opacity:1;color:rgb(176 176 
176/var(--tw-placeholder-opacity,1))}.placeholder-\[\#B0B0B0\]::placeholder{--tw-placeholder-opacity:1;color:rgb(176 176 176/var(--tw-placeholder-opacity,1))}.shadow-lg{--tw-shadow:0 10px 15px -3px rgba(0,0,0,.1),0 4px 6px -4px rgba(0,0,0,.1);--tw-shadow-colored:0 10px 15px -3px var(--tw-shadow-color),0 4px 6px -4px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow,0 0 #0000),var(--tw-ring-shadow,0 0 #0000),var(--tw-shadow)}.hover\:bg-\[\#7A7A7A\]:hover{--tw-bg-opacity:1;background-color:rgb(122 122 122/var(--tw-bg-opacity,1))}.focus\:outline-none:focus{outline:2px solid transparent;outline-offset:2px}.disabled\:cursor-not-allowed:disabled{cursor:not-allowed}.disabled\:bg-\[\#3A3A3A\]:disabled{--tw-bg-opacity:1;background-color:rgb(58 58 58/var(--tw-bg-opacity,1))}


@@ -3,7 +3,7 @@ services:
build: .
container_name: screw-bardo-container
ports:
- "$PORT:1986"
- "1986:1986"
volumes:
- ./logs:/app/logs
- ./app/logs:/app/logs
restart: unless-stopped

3
src/build-css.sh Normal file

@@ -0,0 +1,3 @@
#!/bin/bash
cd "$(dirname "$0")"
./tailwindcss -i input.css -o ../app/website/static/style.css --minify

14
src/input.css Normal file

@@ -0,0 +1,14 @@
@font-face {
font-family: 'NimbusSansD';
src: url('/static/font/nimbus-sans-d-ot-light.woff2') format('woff2'),
url('/static/font/nimbus-sans-d-ot-light.woff') format('woff');
font-weight: normal;
font-style: normal;
font-display: swap;
}
@tailwind base;
@tailwind components;
@tailwind utilities;

13
src/tailwind.config.js Normal file

@@ -0,0 +1,13 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
content: ["../app/website/**/*.html"],
theme: {
extend: {
fontFamily: {
sans: ['NimbusSansD', 'sans-serif'],
},
},
},
plugins: [],
}

BIN
src/tailwindcss Normal file

Binary file not shown.


@@ -1,3 +0,0 @@
#!/bin/bash
cd ./app
exec gunicorn -b 0.0.0.0:1986 --log-level debug app:app