AI-powered web search, content extraction, and website crawling SDK for Node.js.
npm install search-agentimport { SearchClient } from 'search-agent';
const client = new SearchClient({
clientId: process.env.OBLIEN_CLIENT_ID,
clientSecret: process.env.OBLIEN_CLIENT_SECRET
});
// Search with AI-generated answers
const results = await client.search(
['What is machine learning?'],
{ summaryLevel: 'intelligent', includeAnswers: true }
);
console.log(results[0].answer);Get your API credentials from Oblien Dashboard.
const client = new SearchClient({
clientId: 'your-client-id',
clientSecret: 'your-client-secret',
apiURL: 'https://api.oblien.com' // optional, defaults to production
});Perform AI-powered web searches with batch processing support.
client.search(queries, options)queries (string[], required)
- Array of search queries to execute
- Supports batch processing of multiple queries
includeAnswers (boolean, optional, default: false)
- Generate AI-powered answers for search results
- Answers are created using advanced language models
options (object, optional)
includeMetadata(boolean) - Include search metadata in responsesummaryLevel('low' | 'medium' | 'intelligent') - AI answer quality levellow- Fast, basic answersmedium- Balanced quality and speed (recommended)intelligent- Highest quality, slower generation
maxResults(number) - Maximum number of results per querylanguage(string) - Language code (e.g., 'en', 'es', 'fr', 'de')region(string) - Region code (e.g., 'us', 'uk', 'eu', 'asia')freshness('day' | 'week' | 'month' | 'year' | 'all') - Result freshness filterstartDate(string) - Start date for range filtering (ISO 8601 format)endDate(string) - End date for range filtering (ISO 8601 format)includeImages(boolean) - Include images in search resultsincludeImageDescriptions(boolean) - Include AI-generated image descriptionsincludeFavicon(boolean) - Include website faviconsincludeRawContent('none' | 'with_links' | 'with_images_and_links') - Raw content inclusion levelchunksPerSource(number) - Number of content chunks per sourcecountry(string) - Country code filterincludeDomains(string[]) - Array of domains to include in searchexcludeDomains(string[]) - Array of domains to exclude from searchsearchTopic('general' | 'news' | 'finance') - Search topic categorysearchDepth('basic' | 'advanced') - Search depth leveltimeRange('none' | 'day' | 'week' | 'month' | 'year') - Time range filter
Promise resolving to array of search results. Each result contains:
{
success: boolean, // Whether the search succeeded
query: string, // Original search query
results: Array<{
url: string, // Result URL
title: string, // Result title
content: string, // Result content/snippet
// Additional fields based on options
}>,
answer?: string, // AI-generated answer (if includeAnswers=true)
metadata?: object, // Search metadata (if includeMetadata=true)
time_took: number // Time taken in milliseconds
}Basic search:
const results = await client.search(['TypeScript tutorial']);
console.log(results[0].results[0].title);
console.log(results[0].results[0].url);Search with AI answers:
const results = await client.search(
['What is quantum computing?'],
{ summaryLevel: 'intelligent', includeAnswers: true }
);
console.log(results[0].answer);Batch search with options:
const results = await client.search(
[
'Latest AI developments',
'Machine learning best practices',
'Neural networks explained'
],
{
includeAnswers: true,
summaryLevel: 'medium',
maxResults: 5,
freshness: 'week',
language: 'en',
region: 'us'
}
);
results.forEach(result => {
console.log(`Query: ${result.query}`);
console.log(`Answer: ${result.answer}`);
console.log(`Results count: ${result.results.length}`);
});Advanced filtering:
const results = await client.search(
['climate change research'],
{
includeDomains: ['nature.com', 'science.org', 'pnas.org'],
searchTopic: 'general',
freshness: 'year',
includeMetadata: true
}
);Extract specific content from web pages using AI-powered extraction.
client.extract(pages, options)pages (array, required)
- Array of page objects to extract content from
- Each page must include:
url(string, required) - Page URL to extract fromdetails(string[], required) - Array of extraction instructionssummaryLevel('low' | 'medium' | 'intelligent', optional) - Extraction quality level
options (object, optional)
includeMetadata(boolean) - Include extraction metadatatimeout(number) - Request timeout in milliseconds (default: 30000)maxContentLength(number) - Maximum content length to extractformat('markdown' | 'html' | 'text') - Output format (default: 'markdown')extractDepth('basic' | 'advanced') - Extraction depth levelincludeImages(boolean) - Include images in extracted contentincludeFavicon(boolean) - Include page faviconmaxLength(number) - Maximum extraction length
Promise resolving to extraction response:
{
success: boolean, // Whether extraction succeeded
data: Array<{
result: object, // Extracted content
page: {
url: string,
details: string[],
summaryLevel: string
}
}>,
errors: string[], // Array of error messages (if any)
time_took: number // Time taken in milliseconds
}Basic extraction:
const extracted = await client.extract([
{
url: 'https://example.com/article',
details: [
'Extract the article title',
'Extract the main content',
'Extract the author name'
]
}
]);
if (extracted.success) {
console.log(extracted.data[0].result);
}Batch extraction with different instructions:
const extracted = await client.extract([
{
url: 'https://example.com/blog/post-1',
details: [
'Extract blog post title',
'Extract publication date',
'Extract main content',
'Extract tags'
],
summaryLevel: 'medium'
},
{
url: 'https://example.com/products',
details: [
'Extract all product names',
'Extract product prices',
'Extract product descriptions'
],
summaryLevel: 'intelligent'
},
{
url: 'https://example.com/about',
details: [
'Extract company mission',
'Extract team members',
'Extract contact information'
]
}
], {
format: 'markdown',
timeout: 60000
});
extracted.data.forEach(item => {
console.log(`URL: ${item.page.url}`);
console.log(`Extracted:`, item.result);
});Advanced extraction with options:
const extracted = await client.extract([
{
url: 'https://research.example.com/paper',
details: [
'Extract paper abstract',
'Extract key findings',
'Extract methodology',
'Extract conclusions',
'Extract references'
],
summaryLevel: 'intelligent'
}
], {
format: 'markdown',
extractDepth: 'advanced',
includeMetadata: true,
maxContentLength: 100000
});Crawl and research websites with real-time streaming results.
client.crawl(instructions, onEvent, options)instructions (string, required)
- Natural language instructions for the crawl
- Must include the target URL and what to extract/research
- Example: "Crawl https://example.com/blog and extract all article titles and summaries"
onEvent (function, optional)
- Callback function for real-time streaming events
- Receives event objects with type and data
- Event types:
page_crawled- Page was successfully crawledcontent- Content chunk extractedthinking- AI thinking process (if enabled)error- Error occurredcrawl_end- Crawl completed
options (object, optional)
type('deep' | 'shallow' | 'focused') - Crawl type (default: 'deep')deep- Comprehensive crawl with detailed analysisshallow- Quick crawl of main pages onlyfocused- Targeted crawl based on instructions
thinking(boolean) - Enable AI thinking process (default: true)allow_thinking_callback(boolean) - Stream thinking events to callback (default: true)stream_text(boolean) - Stream text results to callback (default: true)maxDepth(number) - Maximum crawl depth (number of link levels)maxPages(number) - Maximum number of pages to crawlincludeExternal(boolean) - Include external links in crawltimeout(number) - Request timeout in millisecondscrawlDepth('basic' | 'advanced') - Crawl depth levelformat('markdown' | 'html' | 'text') - Output formatincludeImages(boolean) - Include images in crawled contentincludeFavicon(boolean) - Include faviconsfollowLinks(boolean) - Follow links during crawl
Promise resolving to final crawl response:
{
success: boolean, // Whether crawl succeeded
time_took: number // Time taken in milliseconds
}Basic crawl:
const result = await client.crawl(
'Crawl https://example.com/blog and summarize all blog posts'
);
console.log(`Completed in ${result.time_took}ms`);Crawl with event streaming:
const result = await client.crawl(
'Crawl https://example.com/docs and extract all API endpoints with their descriptions',
(event) => {
if (event.type === 'page_crawled') {
console.log(`Crawled page: ${event.url}`);
} else if (event.type === 'content') {
console.log(`Content found: ${event.text}`);
} else if (event.type === 'thinking') {
console.log(`AI thinking: ${event.thought}`);
}
},
{
type: 'deep',
maxPages: 20
}
);Advanced crawl with options:
let crawledPages = [];
let extractedContent = [];
const result = await client.crawl(
'Research https://news.example.com and find all articles about artificial intelligence from the last week',
(event) => {
switch (event.type) {
case 'page_crawled':
crawledPages.push(event.url);
console.log(`Progress: ${crawledPages.length} pages crawled`);
break;
case 'content':
extractedContent.push(event.text);
console.log(`Extracted content length: ${event.text.length} chars`);
break;
case 'thinking':
console.log(`AI Analysis: ${event.thought}`);
break;
case 'error':
console.error(`Error: ${event.error}`);
break;
}
},
{
type: 'focused',
thinking: true,
maxDepth: 3,
maxPages: 50,
includeExternal: false,
format: 'markdown'
}
);
console.log(`Crawl completed:`);
console.log(`- Total pages: ${crawledPages.length}`);
console.log(`- Content chunks: ${extractedContent.length}`);
console.log(`- Time: ${result.time_took}ms`);Research-focused crawl:
const result = await client.crawl(
`Crawl https://company.example.com and create a comprehensive report including:
- Company overview and mission
- Product offerings and features
- Pricing information
- Customer testimonials
- Contact information and locations`,
(event) => {
if (event.type === 'content') {
// Process extracted content in real-time
processAndStore(event.text);
}
},
{
type: 'deep',
thinking: true,
maxDepth: 4,
format: 'markdown'
}
);All methods throw errors for failed requests. Always use try-catch blocks:
try {
const results = await client.search(['test query']);
console.log(results);
} catch (error) {
console.error('Search failed:', error.message);
}Common error scenarios:
Missing credentials:
// Throws: 'clientId is required'
new SearchClient({ clientSecret: 'secret' });Invalid parameters:
// Throws: 'queries must be a non-empty array'
await client.search([]);
// Throws: 'Page at index 0 is missing required field: details'
await client.extract([{ url: 'https://example.com' }]);API errors:
try {
await client.search(['query']);
} catch (error) {
// Error message includes API error details
console.error(error.message);
}Full TypeScript definitions are included:
import { SearchClient, SearchOptions, SearchResult } from 'search-agent';
const client = new SearchClient({
clientId: process.env.OBLIEN_CLIENT_ID!,
clientSecret: process.env.OBLIEN_CLIENT_SECRET!
});
const options: SearchOptions = {
summaryLevel: 'intelligent',
maxResults: 10,
freshness: 'week',
includeAnswers: true
};
const results: SearchResult[] = await client.search(
['TypeScript best practices'],
options
);The API implements rate limiting. Refer to response headers for current limits:
X-RateLimit-Limit- Total requests allowed per windowX-RateLimit-Remaining- Requests remaining in current windowX-RateLimit-Reset- Window reset time (Unix timestamp)
- Batch related queries - Process multiple related searches in one request
- Use appropriate summary levels - 'low' for speed, 'intelligent' for quality
- Set reasonable maxResults - Typical range: 5-20 results
- Apply filters early - Use domain filters and freshness to improve relevance
// Good: Batch related queries
const results = await client.search(
['ML frameworks', 'ML tools', 'ML libraries'],
{ includeAnswers: true, summaryLevel: 'medium', maxResults: 5 }
);
// Better: Use filters for precision
const results = await client.search(
['latest research'],
{
includeAnswers: true,
includeDomains: ['arxiv.org', 'nature.com'],
freshness: 'month'
}
);- Be specific with details - Clear instructions produce better results
- Batch similar extractions - Process related pages together
- Set appropriate timeouts - Increase for large pages or slow sites
// Good: Specific extraction instructions
const extracted = await client.extract([
{
url: 'https://example.com',
details: [
'Extract the main heading (h1 tag)',
'Extract the first paragraph of content',
'Extract any pricing information mentioned'
]
}
]);
// Better: Include context in instructions
const extracted = await client.extract([
{
url: 'https://example.com/product',
details: [
'Extract product name from the title',
'Extract pricing from the price section',
'Extract feature list from the features section',
'Extract customer rating if available'
],
summaryLevel: 'intelligent'
}
], {
timeout: 60000,
format: 'markdown'
});- Provide clear instructions - Include URL and specific research goals
- Set appropriate limits - maxPages and maxDepth prevent runaway crawls
- Use event callbacks - Process results in real-time for large crawls
- Choose the right type - 'focused' for specific targets, 'deep' for comprehensive
// Good: Clear, focused instructions
await client.crawl(
'Crawl https://docs.example.com and extract all API method signatures',
(event) => processEvent(event),
{ type: 'focused', maxPages: 30 }
);
// Better: Detailed instructions with constraints
await client.crawl(
`Research https://example.com/blog for articles about:
- Cloud computing trends
- DevOps best practices
- Container orchestration
Extract title, date, and summary for each relevant article`,
(event) => {
if (event.type === 'content') {
saveToDatabase(event.text);
}
},
{
type: 'deep',
maxPages: 50,
maxDepth: 3,
thinking: true
}
);- Documentation: https://oblien.com/docs/search-api
- Dashboard: https://oblien.com/dashboard
- Issues: https://github.com/oblien/search-agent/issues
MIT