# Lark Base Python SDK
Comprehensive Python SDK for managing Lark Base (Bitable) data with high-level operations for batch processing, data synchronization, and transformation.
## Features
- **Full CRUD Operations**: Complete support for tables, fields, and records
- **Batch Operations**: Efficient bulk create, update, and delete operations
- **Data Synchronization**: Sync data from external sources (TikTok, databases, APIs)
- **Data Transformation**: Built-in utilities for data mapping and validation
- **Rate Limiting**: Automatic rate limiting to prevent API throttling
- **Error Handling**: Comprehensive error handling and logging
- **Type Safety**: Full type hints for better IDE support
- **Production Ready**: Logging, validation, and retry mechanisms
## Installation
### Prerequisites
- Python 3.8 or higher
- Lark Open Platform account
- Lark Base (Bitable) app token
### Install Dependencies
```bash
cd python
pip install -r requirements.txt
```
## Environment Setup
### 1. Get Lark API Credentials
1. Go to [Lark Open Platform](https://open.larksuite.com/)
2. Create a new app or use an existing one
3. Get your `App ID` and `App Secret` from the app credentials page
4. Enable the following permissions:
- `bitable:app`
- `bitable:app:readonly`
### 2. Get Base App Token
1. Open your Lark Base in a browser
2. The URL will look like: `https://xxx.larksuite.com/base/bascnXXXXXXXXXXXX`
3. The `bascnXXXXXXXXXXXX` part is your app token
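If you work with several bases, the token can also be pulled out of the URL programmatically. A small sketch (the helper name and regex are our own, not part of the SDK):

```python
import re

def app_token_from_url(url: str) -> str:
    """Extract the Base app token (the `bascn...` segment) from a Lark Base URL."""
    match = re.search(r"/base/([A-Za-z0-9]+)", url)
    if not match:
        raise ValueError(f"No app token found in {url!r}")
    return match.group(1)
```
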
### 3. Create Environment File
Copy the example environment file and fill in your credentials:
```bash
cp .env.example .env
```
Edit `.env` with your credentials:
```env
LARK_APP_ID=cli_xxxxxxxxxxxxxxxx
LARK_APP_SECRET=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
LARK_APP_TOKEN=bascnxxxxxxxxxxxxxxxxxxxxxxx
LOG_LEVEL=INFO
```
## Quick Start
### Basic Usage
```python
from dotenv import load_dotenv
import os

from lark_client import LarkBaseClient, LarkConfig

# Load environment variables
load_dotenv()

# Initialize client
config = LarkConfig(
    app_id=os.getenv('LARK_APP_ID'),
    app_secret=os.getenv('LARK_APP_SECRET'),
    log_level='INFO'
)
client = LarkBaseClient(config, rate_limit=10)

app_token = os.getenv('LARK_APP_TOKEN')

# Create a table
table_id = client.create_table(
    app_token=app_token,
    table_name="My Data",
    default_view_name="All Records"
)

# Add fields
client.create_field(
    app_token=app_token,
    table_id=table_id,
    field_name="Name",
    field_type=1  # Text
)
client.create_field(
    app_token=app_token,
    table_id=table_id,
    field_name="Revenue",
    field_type=2  # Number
)

# Create a record
record_id = client.create_record(
    app_token=app_token,
    table_id=table_id,
    fields={
        "Name": "Acme Corp",
        "Revenue": 150000
    }
)

# List all records
records = client.get_all_records(app_token, table_id)
print(f"Total records: {len(records)}")
```
## Field Types
Common Lark Base field types:
| Type | Code | Description |
|------|------|-------------|
| Text | 1 | Single-line text |
| Number | 2 | Numeric values |
| Single Select | 3 | Single choice from options |
| Multi Select | 4 | Multiple choices from options |
| DateTime | 5 | Date and time |
| Checkbox | 7 | Boolean value |
| User | 11 | User mention |
| Phone | 13 | Phone number |
| URL | 15 | Web link |
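To avoid magic numbers in your own scripts, the codes above can be kept in a small lookup. The constant and helper below are our own convention, not part of the SDK:

```python
# Numeric field type codes from the table above.
FIELD_TYPES = {
    "text": 1,
    "number": 2,
    "single_select": 3,
    "multi_select": 4,
    "datetime": 5,
    "checkbox": 7,
    "user": 11,
    "phone": 13,
    "url": 15,
}

def field_type_code(name: str) -> int:
    """Look up a field type code by a human-readable name, e.g. 'Single Select'."""
    key = name.lower().replace(" ", "_")
    if key not in FIELD_TYPES:
        raise ValueError(f"Unknown field type: {name!r}")
    return FIELD_TYPES[key]
```

You can then write `client.create_field(..., field_type=field_type_code("Number"))` instead of a bare `2`.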
## Usage Examples
### Example 1: Basic CRUD Operations
```bash
python examples/basic_usage.py
```
This example demonstrates:
- Creating tables and fields
- Creating single and batch records
- Reading and filtering records
- Updating and deleting records
### Example 2: Sync TikTok Data
```bash
python examples/sync_tiktok_data.py
```
This example shows:
- Fetching data from an external API (TikTok Ads)
- Transforming data to Lark Base format
- Syncing data with upsert logic
- Incremental updates based on timestamps
### Example 3: Batch Operations
```bash
python examples/batch_update.py
```
This example covers:
- Batch creating 100+ records
- Data validation before loading
- Bulk price updates
- Field mapping from external sources
- Data aggregation and reporting
## Advanced Features
### Data Manager
The `DataManager` class provides high-level operations:
```python
from data_manager import DataManager
data_manager = DataManager(client)
# Batch upsert with automatic create/update logic
result = data_manager.batch_upsert_records(
    app_token=app_token,
    table_id=table_id,
    records=records,
    key_field="ID"
)
print(f"Created: {result['created']}, Updated: {result['updated']}")
```
### Data Transformation
Transform data with custom functions:
```python
def transform_record(record):
    return {
        "Customer Name": record['name'].upper(),
        "Revenue": record['revenue'] * 1.1,  # Add 10% markup
        "Category": record['category']
    }

result = data_manager.transform_and_load(
    app_token=app_token,
    table_id=table_id,
    source_records=external_data,
    transformer=transform_record,
    key_field="Customer Name",
    parallel=True  # Use parallel processing
)
```
### Data Validation
Validate records before loading:
```python
validation_result = data_manager.validate_records(
    records=records,
    required_fields=["Name", "Email"],
    validators={
        "Email": lambda x: '@' in str(x),
        "Revenue": lambda x: x >= 0
    }
)

# Only load valid records
if validation_result['valid_count'] > 0:
    client.batch_create_records(
        app_token,
        table_id,
        validation_result['valid_records']
    )
```
### Field Mapping
Map external field names to Lark Base schema:
```python
field_mapping = {
    "customer_id": "Customer ID",
    "full_name": "Name",
    "email_address": "Email",
    "total_revenue": "Revenue"
}

mapped_record = data_manager.map_fields(
    source_record=external_record,
    field_mapping=field_mapping,
    default_values={"Status": "Active"}
)
```
### Incremental Sync
Sync only changed records:
```python
from datetime import datetime, timedelta

last_sync = datetime.now() - timedelta(hours=24)

result = data_manager.sync_data_incremental(
    app_token=app_token,
    table_id=table_id,
    data_source=fetch_external_data,
    key_field="ID",
    timestamp_field="updated_at",
    last_sync_time=last_sync
)
```
### Data Aggregation
Aggregate data before loading:
```python
aggregated = data_manager.aggregate_data(
    records=records,
    group_by="Category",
    aggregations={
        "Revenue": sum,
        "Orders": lambda x: len(x),
        "Avg Price": lambda x: sum(x) / len(x)
    }
)
```
## API Reference
### LarkBaseClient
#### Table Operations
- `list_tables(app_token, page_size=100)` - List all tables
- `create_table(app_token, table_name, default_view_name=None)` - Create a table
- `get_all_records(app_token, table_id, field_names=None)` - Get all records with pagination
#### Field Operations
- `list_fields(app_token, table_id, page_size=100)` - List all fields
- `create_field(app_token, table_id, field_name, field_type, property_dict=None)` - Create a field
#### Record Operations
- `list_records(app_token, table_id, page_size=100, page_token=None, ...)` - List records with pagination
- `create_record(app_token, table_id, fields)` - Create a single record
- `batch_create_records(app_token, table_id, records)` - Create multiple records
- `update_record(app_token, table_id, record_id, fields)` - Update a single record
- `batch_update_records(app_token, table_id, records)` - Update multiple records
- `delete_record(app_token, table_id, record_id)` - Delete a single record
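`get_all_records` handles pagination for you; if you call `list_records` directly, the page-token loop looks roughly like this (a generic sketch — `fetch_page` is a stand-in for a `list_records` call, not an SDK function):

```python
def paginate(fetch_page):
    """Accumulate items across a paged API.

    fetch_page(page_token) must return (items, next_page_token),
    with next_page_token falsy on the last page.
    """
    items, token = [], None
    while True:
        page, token = fetch_page(token)
        items.extend(page)
        if not token:
            return items
```
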
### DataManager
#### Batch Operations
- `batch_upsert_records(app_token, table_id, records, key_field, batch_size=500)` - Upsert records
- `sync_data_incremental(app_token, table_id, data_source, key_field, ...)` - Incremental sync
#### Transformation
- `transform_and_load(app_token, table_id, source_records, transformer, ...)` - Transform and load data
- `map_fields(source_record, field_mapping, default_values=None)` - Map field names
#### Validation
- `validate_records(records, required_fields, validators=None)` - Validate records
#### Aggregation
- `aggregate_data(records, group_by, aggregations)` - Aggregate data
#### Export
- `export_to_dict(app_token, table_id, field_names=None)` - Export to dictionaries
#### Utilities
- `add_timestamp_fields(record)` - Add timestamp fields
- `convert_to_lark_datetime(dt)` - Convert Python datetime to Lark timestamp
- `convert_from_lark_datetime(timestamp_ms)` - Convert Lark timestamp to Python datetime
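Lark Base stores DateTime values as Unix timestamps in milliseconds, so the two conversion helpers behave roughly like this standalone sketch (not the SDK's actual implementation; it assumes naive datetimes are UTC):

```python
from datetime import datetime, timezone

def to_lark_timestamp(dt: datetime) -> int:
    """Python datetime -> Unix milliseconds, as Lark DateTime fields expect."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assume naive datetimes are UTC
    return int(dt.timestamp() * 1000)

def from_lark_timestamp(ms: int) -> datetime:
    """Unix milliseconds -> timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```
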
## Integration with Dashboards
Once you've loaded data into Lark Base using this SDK, you can create dashboards that automatically visualize it:
### Using the TypeScript SDK
```typescript
import { LarkDashboardClient, ChartBlockBuilder, AggregationType } from '@hypelab/hype-dash';

const client = new LarkDashboardClient({
  apiKey: process.env.LARK_API_KEY
});

// Create a dashboard for your Python-synced data
const dashboardId = await client.createDashboard({
  name: 'TikTok Analytics',
  appToken: 'YOUR_APP_TOKEN'
});

// Add a chart using the table created via Python
const chart = ChartBlockBuilder.bar()
  .dataSource('YOUR_APP_TOKEN', 'YOUR_TABLE_ID')
  .xAxis({ fieldName: 'Campaign Name' })
  .yAxis([{ fieldName: 'Spend', aggregation: AggregationType.SUM }])
  .build();

await client.addBlock('YOUR_APP_TOKEN', dashboardId, chart);
```
### Workflow
1. **Python SDK** - Manage data (create tables, sync external data, batch updates)
2. **TypeScript SDK** - Create dashboards that visualize the data
3. **Automatic Updates** - When you update data via Python, dashboards update automatically
## Error Handling
The SDK includes comprehensive error handling:
```python
import logging

logger = logging.getLogger(__name__)

try:
    record_id = client.create_record(app_token, table_id, fields)
except Exception as e:
    logger.error(f"Failed to create record: {e}")
    # Handle error appropriately
```
## Logging
Configure the logging level via `LarkConfig` (or the `LOG_LEVEL` environment variable):
```python
config = LarkConfig(
    app_id=os.getenv('LARK_APP_ID'),
    app_secret=os.getenv('LARK_APP_SECRET'),
    log_level='DEBUG'  # DEBUG, INFO, WARNING, ERROR
)
```
## Rate Limiting
The client includes automatic rate limiting:
```python
# Limit to 5 requests per second
client = LarkBaseClient(config, rate_limit=5)
```
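A limiter of this kind can be as simple as enforcing a minimum interval between calls. An illustrative sketch (our own, not the SDK's actual implementation):

```python
import time

class SimpleRateLimiter:
    """Allow at most `rate` calls per second by spacing calls out."""

    def __init__(self, rate: float):
        self.min_interval = 1.0 / rate
        self._last_call = 0.0

    def wait(self):
        """Sleep just long enough to honor the minimum interval, then record the call."""
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last_call = time.monotonic()
```
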
## Best Practices
1. **Batch Operations**: Use batch operations for creating/updating multiple records
2. **Field Names**: Use `field_names` parameter to fetch only needed fields
3. **Validation**: Validate data before loading to prevent errors
4. **Timestamps**: Use milliseconds for DateTime fields
5. **Key Fields**: Always specify a unique key field for upsert operations
6. **Rate Limits**: Respect Lark API rate limits (default: 10 req/s)
7. **Logging**: Enable logging for debugging and monitoring
8. **Environment**: Never commit `.env` file with credentials
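The retry mechanisms mentioned under Features can be approximated with a generic exponential-backoff wrapper around any client call (a sketch — `with_retries` is our own helper, not an SDK export):

```python
import time

def with_retries(fn, max_attempts=3, base_delay=0.5):
    """Call fn(); on exception, retry with exponential backoff (0.5s, 1s, ...)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Usage: `record_id = with_retries(lambda: client.create_record(app_token, table_id, fields))`.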
## Troubleshooting
### Common Issues
**Authentication Error**
- Verify `LARK_APP_ID` and `LARK_APP_SECRET` are correct
- Check app permissions in Lark Open Platform
**Table Not Found**
- Verify `LARK_APP_TOKEN` is correct
- Ensure table ID exists in the base
**Rate Limit Error**
- Reduce `rate_limit` parameter
- Add delays between operations
**Field Type Mismatch**
- Check field types match Lark Base schema
- Use correct field type codes
## Support
For issues and questions:
- GitHub Issues: https://github.com/hypelab/hype-dash/issues
- Email: dev@hypelab.com
## License
MIT License - see LICENSE file for details.