Commit 38bd11c

feat: add shift-read demo app (closes #1761) (#1901)
* feat: add shift-read to community project
* chore: add changeset
* chore: cleanup unnecessary files and loc
* fix: improve error handling, and UI polish
* refactor: format url and improve error handling
* fix: use empty changeset
* fix: update translate to guard missing lingo key

---------

Co-authored-by: Sumit Saurabh <62152915+sumitsaurabh927@users.noreply.github.com>
1 parent 7b9a7d0 commit 38bd11c

28 files changed: 12301 additions, 0 deletions

.changeset/add-shift-read.md

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
---
---

Add shift-read demo app to community projects

community/shift-read/.env.example

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
FIRECRAWL_API_KEY=firecrawl_api_key
GROQ_API_KEY=groq_api_key
GROQ_MODEL=meta-llama/llama-4-scout-17b-16e-instruct
LINGODOTDEV_API_KEY=lingodotdev_api_key
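
As a hedged illustration of how these variables might be checked before the app starts, the sketch below reports which keys are missing or blank. The helper `findMissingEnvKeys` and the `Env` type are hypothetical and not part of this commit. `GROQ_MODEL` is treated as optional here because `cleanMarkdown` falls back to a default model when it is unset.

```typescript
// Hypothetical helper (not in the commit): lists the required variables
// from .env.example that are missing or contain only whitespace.
const REQUIRED_ENV_KEYS = [
  "FIRECRAWL_API_KEY",
  "GROQ_API_KEY",
  "LINGODOTDEV_API_KEY",
] as const;

type Env = Record<string, string | undefined>;

function findMissingEnvKeys(env: Env): string[] {
  // A key counts as missing when it is undefined, empty, or blank.
  return REQUIRED_ENV_KEYS.filter((key) => !env[key]?.trim());
}
```

In a Node.js process this could be called as `findMissingEnvKeys(process.env)` and the result logged or turned into a startup error.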

community/shift-read/.gitignore

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.*
.yarn/*
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/versions

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.pnpm-debug.log*

# env files (can opt-in for committing if needed)
.env

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts

community/shift-read/README.md

Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
# Shift

Read any article on the internet in your preferred language.
Shift extracts article content from web pages and translates it while preserving formatting, images, and typography.

- [Check out the live app](https://shift-read.vercel.app/)
- [View the demo](https://5kas5z928t.ufs.sh/f/wBHVA4PQTleAKsb2NVrIL2VE9DjCy53AWlsMSoTNfqhc0U8J)

## Features

- **Web Scraping**: Extract clean article content from any URL
- **Translation**: Translate articles to 12+ languages while preserving Markdown formatting
- **Beautiful Reading**: Clean, minimal reader mode with typography optimized for long-form content
- **Language Toggle**: Seamlessly switch between original and translated content
- **Dark Mode**: Toggle between light and dark themes
- **Smart Caching**: Articles and translations are cached locally for instant re-access

## Tech Stack

- **Framework**: Next.js 16.1.4 (App Router)
- **Language**: TypeScript
- **Styling**: Tailwind CSS 4
- **UI Components**: shadcn/ui
- **Web Scraping**: @mendable/firecrawl-js
- **Translation**: lingo.dev SDK
- **Markdown Rendering**: react-markdown with remark-gfm and rehype-highlight
- **Syntax Highlighting**: react-syntax-highlighter
- **Caching**: localStorage with timestamp-based cache management

## What lingo.dev Feature It Highlights

Shift showcases **lingo.dev's Markdown translation capabilities**. The app demonstrates how lingo.dev can:

- Translate complex Markdown content while preserving formatting
- Maintain document structure during translation
- Provide seamless language switching for content-heavy applications

The translation preserves:
- Headers and text hierarchy
- Links and their targets
- Lists, quotes, and other Markdown elements

## Installation

### Prerequisites

- Node.js 20.9+ (the minimum supported by Next.js 16)
- pnpm (recommended) or npm/yarn

### Setup

1. **Clone the repository**
   ```bash
   git clone <repository-url>
   cd shift-read
   ```

2. **Install dependencies**
   ```bash
   pnpm install
   ```

3. **Set up environment variables**

   Create a `.env` file in the project root with the variables listed in `.env.example`:
   ```env
   FIRECRAWL_API_KEY=your_firecrawl_api_key_here
   LINGODOTDEV_API_KEY=your_lingodotdev_api_key_here
   GROQ_API_KEY=your_groq_api_key_here
   ```

4. **Get API Keys**
   - **Firecrawl**: Sign up at [firecrawl.dev](https://firecrawl.dev) to get your API key
   - **Groq**: Sign up at [groq.com](https://groq.com) to get your API key
   - **lingo.dev**: Sign up at [lingo.dev](https://lingo.dev) to get your API key

5. **Run the development server**
   ```bash
   pnpm dev
   ```

6. **Open your browser**
   Navigate to [http://localhost:3000](http://localhost:3000)

## Running Locally

### Development Mode
```bash
pnpm dev
```
Starts the development server with hot reload at `http://localhost:3000`.

### Build for Production
```bash
pnpm build
```
Creates an optimized production build.

### Start Production Server
```bash
pnpm start
```
Runs the production build at `http://localhost:3000`.

### Linting
```bash
pnpm lint
```
Runs ESLint to check for code issues.

## Supported Languages

Shift supports translation to these languages:

- 🇪🇸 Spanish (es)
- 🇫🇷 French (fr)
- 🇩🇪 German (de)
- 🇯🇵 Japanese (ja)
- 🇨🇳 Chinese (zh)
- 🇸🇦 Arabic (ar)
- 🇮🇳 Hindi (hi)
- 🇵🇹 Portuguese (pt)
- 🇷🇺 Russian (ru)
- 🇰🇷 Korean (ko)
- 🇮🇹 Italian (it)
- 🇳🇱 Dutch (nl)

The source language is automatically detected and filtered from the translation options.
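
A minimal sketch of how that filtering could work, assuming the detected source language arrives as a language tag such as `en` or `es-ES`. The names `Language`, `SUPPORTED_LANGUAGES`, `baseLanguageCode`, and `getTargetLanguages` are illustrative and are not taken from the app's source.

```typescript
interface Language {
  code: string;
  name: string;
}

// The twelve target languages listed above.
const SUPPORTED_LANGUAGES: Language[] = [
  { code: "es", name: "Spanish" },
  { code: "fr", name: "French" },
  { code: "de", name: "German" },
  { code: "ja", name: "Japanese" },
  { code: "zh", name: "Chinese" },
  { code: "ar", name: "Arabic" },
  { code: "hi", name: "Hindi" },
  { code: "pt", name: "Portuguese" },
  { code: "ru", name: "Russian" },
  { code: "ko", name: "Korean" },
  { code: "it", name: "Italian" },
  { code: "nl", name: "Dutch" },
];

// Reduces a tag such as "es-ES" or "PT_br" to its base code ("es", "pt").
function baseLanguageCode(tag: string): string {
  const [base = ""] = tag.trim().toLowerCase().split(/[-_]/);
  return base;
}

// Returns the languages to offer, excluding the article's own language.
// When the source language is unknown, every language stays available.
function getTargetLanguages(sourceLanguage?: string): Language[] {
  if (!sourceLanguage) return SUPPORTED_LANGUAGES;
  const source = baseLanguageCode(sourceLanguage);
  return SUPPORTED_LANGUAGES.filter((lang) => lang.code !== source);
}
```

With this sketch, a Spanish article would be offered eleven targets, while an English article (not in the list) would be offered all twelve.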

## Project Structure

```text
shift-read/
├── app/
│   ├── page.tsx                 # Homepage with URL input
│   ├── layout.tsx               # Root layout with providers
│   ├── globals.css              # Global styles (Tailwind v4)
│   ├── read/[...url]/
│   │   └── page.tsx             # Reading page with article display
│   └── actions/
│       ├── fetchContent.ts      # Firecrawl server action
│       ├── translate.ts         # lingo.dev server action
│       └── cleanMarkdown.ts     # Markdown cleanup utilities
├── components/
│   ├── ArticleHeader.tsx        # Title, author, date, image display
│   ├── LanguageSelector.tsx     # Language dropdown
│   ├── MDXRender.tsx            # Markdown renderer with custom components
│   └── ThemeToggle.tsx          # Dark/light mode toggle
├── lib/
│   ├── utils.ts                 # Utility functions
│   └── storage.ts               # localStorage helpers
├── README.md
├── package.json
├── next.config.ts
└── tsconfig.json
```

## How It Works

1. **URL Input**: User enters an article URL on the homepage
2. **Content Extraction**: Firecrawl scrapes the URL and extracts clean Markdown content
3. **Caching**: Article is cached in localStorage for instant future access
4. **Translation**: User can select a target language and lingo.dev translates the content
5. **Display**: Article is rendered with beautiful typography and preserved formatting
6. **Toggle**: Users can switch between original and translated content seamlessly
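
The caching step can be sketched as a small timestamp-based cache. This is an illustration under stated assumptions, not the contents of `lib/storage.ts`: the 24-hour expiry, the `KeyValueStore` interface, and the helpers `writeCache`, `readCache`, and `createMemoryStore` are all hypothetical. The store is passed in so the same logic works with the browser's `localStorage` or with an in-memory stand-in.

```typescript
// The subset of the Web Storage API that the cache needs.
// The browser's localStorage satisfies this interface.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface CacheEntry<T> {
  value: T;
  savedAt: number; // epoch milliseconds at write time
}

// Assumed expiry; the README does not state how long entries live.
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

function writeCache<T>(
  store: KeyValueStore,
  key: string,
  value: T,
  now: number = Date.now(),
): void {
  const entry: CacheEntry<T> = { value, savedAt: now };
  store.setItem(key, JSON.stringify(entry));
}

// Returns the cached value, or null when the entry is absent, expired,
// or unreadable. Expired and corrupt entries are removed.
function readCache<T>(
  store: KeyValueStore,
  key: string,
  now: number = Date.now(),
  maxAgeMs: number = MAX_AGE_MS,
): T | null {
  const raw = store.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw) as CacheEntry<T>;
    if (typeof entry.savedAt !== "number" || now - entry.savedAt > maxAgeMs) {
      store.removeItem(key);
      return null;
    }
    return entry.value;
  } catch {
    store.removeItem(key);
    return null;
  }
}

// In-memory stand-in for localStorage, useful on the server or in tests.
function createMemoryStore(): KeyValueStore {
  const map = new Map<string, string>();
  return {
    getItem: (key) => map.get(key) ?? null,
    setItem: (key, value) => {
      map.set(key, value);
    },
    removeItem: (key) => {
      map.delete(key);
    },
  };
}
```

In the browser, `writeCache(localStorage, key, article)` and `readCache(localStorage, key)` would be the equivalent calls. Storing the timestamp alongside the value keeps expiry checks local to each entry, at the cost of one JSON parse per read.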

---

Built by [mayank](https://mayankbansal.xyz)

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
"use server";

import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";
import { CLEANUP_SYSTEM_PROMPT } from "@/lib/system-prompt";

// Groq exposes an OpenAI-compatible endpoint, so the OpenAI provider is reused.
const groq = createOpenAI({
  baseURL: "https://api.groq.com/openai/v1",
  apiKey: process.env.GROQ_API_KEY,
});

// Shape of the JSON object the model is asked to return.
const CleanupResponseSchema = z.object({
  content: z.string().describe("Cleaned and formatted markdown content"),
  warnings: z
    .array(z.string())
    .optional()
    .describe("Any warnings or notes about the cleanup"),
  isComplete: z
    .boolean()
    .describe("Whether the cleanup was successful and content is readable"),
  metadata: z
    .object({
      title: z
        .string()
        .nullable()
        .optional()
        .describe("Extracted article title"),
      author: z
        .string()
        .nullable()
        .optional()
        .describe("Extracted author name"),
      publishedTime: z
        .string()
        .nullable()
        .optional()
        .describe("Publication date in ISO 8601 format"),
      ogImage: z
        .string()
        .nullable()
        .optional()
        .describe(
          "Featured image URL from markdown or fallback to firecrawl metadata",
        ),
    })
    .optional(),
});

export interface CleanedArticle {
  markdown: string;
  metadata: {
    title?: string;
    author?: string;
    publishedTime?: string;
    ogImage?: string;
    language?: string;
  };
}

/**
 * Cleans scraped Markdown with a Groq-hosted model and extracts article metadata.
 * Returns a result object instead of throwing, so callers can branch on `success`.
 */
export async function cleanMarkdown(
  rawMarkdown: string,
  metadata?: Record<string, string | undefined>,
): Promise<{ success: boolean; data?: CleanedArticle; error?: string }> {
  if (!process.env.GROQ_API_KEY) {
    return {
      success: false,
      error: "Groq API key is not configured",
    };
  }

  try {
    const { text } = await generateText({
      // Fall back to the default model when GROQ_MODEL is unset.
      model: groq(
        process.env.GROQ_MODEL || "meta-llama/llama-4-scout-17b-16e-instruct",
      ),
      messages: [
        {
          role: "system",
          content: CLEANUP_SYSTEM_PROMPT,
        },
        {
          role: "user",
          content: `Clean the following scraped content. Extract metadata (title, author, date, image) and return only the main article body, excluding any title, featured image, ads, navigation, or related content.\n\n=== FIRECRAWL METADATA (FOR CONTEXT) ===\n${JSON.stringify(metadata || {}, null, 2)}\n\n=== CONTENT START ===\n${rawMarkdown}\n=== CONTENT END ===`,
        },
      ],
      temperature: 0.2,
    });

    // Strip an optional Markdown code fence wrapped around the model's JSON.
    let jsonString = text.trim();
    jsonString = jsonString
      .replace(/^```json\s*/i, "")
      .replace(/^```\s*/, "")
      .replace(/\s*```$/i, "")
      .trim();
    // Take everything from the first "{" to the last "}" as the JSON candidate.
    const jsonMatch = jsonString.match(/\{[\s\S]*\}/);
    if (!jsonMatch) {
      console.error("No JSON found in response:", text.substring(0, 500));
      return {
        success: false,
        error: "Failed to parse cleanup response",
      };
    }

    let parsedJson;
    try {
      parsedJson = JSON.parse(jsonMatch[0]);
    } catch (parseError) {
      console.error("JSON parse error:", parseError);
      return {
        success: false,
        error: "Failed to parse cleanup response: invalid JSON format",
      };
    }

    // Throws on a schema mismatch; the outer catch turns that into a generic error.
    const cleaned = CleanupResponseSchema.parse(parsedJson);

    if (!cleaned.isComplete || !cleaned.content.trim()) {
      return {
        success: false,
        error: "Could not extract meaningful content from the article",
      };
    }

    return {
      success: true,
      data: {
        markdown: cleaned.content,
        // Prefer metadata extracted by the model, then fall back to Firecrawl's.
        metadata: {
          title: cleaned.metadata?.title || metadata?.title,
          author: cleaned.metadata?.author || metadata?.author,
          publishedTime:
            cleaned.metadata?.publishedTime || metadata?.publishedTime,
          ogImage: cleaned.metadata?.ogImage || metadata?.ogImage,
          language: metadata?.language,
        },
      },
    };
  } catch (error) {
    console.error("Markdown cleanup error:", error);
    return {
      success: false,
      error: "Failed to clean markdown content",
    };
  }
}
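
The fence-stripping and JSON-matching steps in `cleanMarkdown` can be isolated into a pure function, which makes them easier to exercise without calling the model. The sketch below is a hedged illustration, not code from this commit: `extractJsonObject` is a hypothetical name, and the backtick is written as `\x60` in the regular expressions only so the example does not contain a literal code fence.

```typescript
// Hypothetical helper: pulls a JSON value out of a model response that may be
// wrapped in a Markdown code fence or surrounded by prose.
// Returns null when no parseable JSON object is found.
function extractJsonObject(text: string): unknown {
  const unfenced = text
    .trim()
    .replace(/^\x60{3}(?:json)?\s*/i, "") // leading fence, optional "json" tag
    .replace(/\s*\x60{3}$/, "") // trailing fence
    .trim();

  // Greedy match from the first "{" to the last "}".
  const match = unfenced.match(/\{[\s\S]*\}/);
  if (!match) return null;

  try {
    return JSON.parse(match[0]);
  } catch {
    return null;
  }
}
```

Because the match is greedy, stray braces in prose after the JSON object would make parsing fail and the function return null. The result is typed `unknown`, so it should still be validated (for example with the Zod schema above) before use.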
