3 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| geoffsee | b35e50577f | target v2.0.0 of code review action | 2025-06-30 18:16:00 -04:00 |
| geoffsee | d077827c11 | Update toakinize action version in code review workflow | 2025-06-30 18:07:29 -04:00 |
| geoffsee | 312229c8d4 | test code review action | 2025-06-30 18:03:20 -04:00 |

3 changed files with 48 additions and 8 deletions

.github/workflows/code_review.yml (new file, 24 additions)

@@ -0,0 +1,24 @@
name: Code Review
permissions:
  pull-requests: write
  statuses: write
  checks: write
  contents: read
  actions: read
on:
  workflow_dispatch:
  push:
    branches: [ci-dev]
jobs:
  code_review:
    permissions:
      contents: read
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geoffsee/toakinize@v2.0.0
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}


@@ -1,21 +1,25 @@
# code-tokenizer
# toak
it's no joke
[![npm version](https://img.shields.io/npm/v/toak)](https://www.npmjs.com/package/toak)
![Tests](https://github.com/seemueller-io/code-tokenizer/actions/workflows/tests.yml/badge.svg)
![Tests](https://github.com/seemueller-io/toak/actions/workflows/tests.yml/badge.svg)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0.html)
## Overview
`toak` is a cli tool, named for phonetics, that processes git repository files, cleans code, redacts sensitive information, and generates a `prompt.md` with token counts using the Llama 3 tokenizer.
`toak` is an intentionally simple yet powerful tool that processes git repository files, cleans code, redacts sensitive information, and generates markdown documentation with token counts using the Llama 3 tokenizer.
```shell
$ cd your-git-repo
$ npx toak
```
![toak](https://github.com/seemueller-io/code-tokenizer/blob/471c2a359e342c0103d2074650afe1f1b2b5f71d/toak.jpg?raw=true)
![toak](https://github.com/seemueller-io/toak/blob/471c2a359e342c0103d2074650afe1f1b2b5f71d/toak.jpg?raw=true)
## Philosophy
1. _Human-first_ technologies for a better future.
2. If you don't like the name...good.
---
## Features
@@ -25,10 +29,14 @@ $ npx toak
- Redacts sensitive information (API keys, tokens, JWT, hashes)
- Counts tokens using llama3-tokenizer-js
- Supports nested .toak-ignore files
### Token Cleaning
- Removes single-line and multi-line comments
- Strips console.log statements
- Removes import statements
- Cleans up whitespace and empty lines
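The cleaning steps listed above can be sketched roughly as a chain of regex passes. This is an illustrative sketch only, not toak's actual implementation: the function name `cleanSource` and every pattern in it are assumptions, and the regex approach is deliberately naive (it only handles comments that occupy a whole line, single-line `import` statements, and single-line `console.log` calls).

```typescript
// Hypothetical helper for illustration; toak's real cleaning rules may differ.
function cleanSource(source: string): string {
  return source
    // Remove multi-line /* ... */ comments (non-greedy, spans newlines).
    .replace(/\/\*[\s\S]*?\*\//g, "")
    // Remove single-line comments that occupy a whole line.
    // Trailing comments are left alone to avoid mangling URLs such as "https://".
    .replace(/^[ \t]*\/\/.*$/gm, "")
    // Strip console.log statements written on a single line.
    .replace(/^[ \t]*console\.log\(.*\);?[ \t]*$/gm, "")
    // Remove single-line import statements.
    .replace(/^[ \t]*import\s.*$/gm, "")
    // Trim trailing whitespace and drop lines that are now empty.
    .split("\n")
    .map((line) => line.trimEnd())
    .filter((line) => line.trim() !== "")
    .join("\n");
}

const sample = [
  'import fs from "fs";',
  "// a comment",
  "/* block",
  "   comment */",
  "const x = 1;   ",
  "console.log(x);",
  "",
  "export default x;",
].join("\n");

const cleaned = cleanSource(sample);
```

For the sample above, only the `const x = 1;` and `export default x;` lines survive. A production cleaner would more likely work on a parsed syntax tree so that comment markers inside string literals are not misread.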
### Security Features
- Redacts API keys and secrets
- Masks JWT tokens
- Hides authorization tokens
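As a rough illustration of the redaction items above, the sketch below applies a small table of regex rules in order. The rule list, the placeholder strings, and the name `redactSecrets` are all assumptions made for this example; they are not taken from toak's source, and real secret detection would need a broader and more carefully tested rule set.

```typescript
// Hypothetical rules for illustration only; not toak's actual patterns.
const REDACTION_RULES: Array<[RegExp, string]> = [
  // JWT-like values: three base64url segments, the first starting with "eyJ".
  [/\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, "[REDACTED_JWT]"],
  // Authorization bearer tokens: keep the "Bearer " prefix, mask the value.
  [/(Bearer\s+)[A-Za-z0-9._~+\/-]+=*/g, "$1[REDACTED]"],
  // Quoted values assigned to names containing api key, secret, or token.
  [/((?:api[_-]?key|secret|token)\s*[:=]\s*)(["'])[^"']+\2/gi, "$1$2[REDACTED]$2"],
];

function redactSecrets(text: string): string {
  // Apply each rule in turn to the output of the previous one.
  return REDACTION_RULES.reduce(
    (acc, [pattern, replacement]) => acc.replace(pattern, replacement),
    text,
  );
}

const redactedKey = redactSecrets('const apiKey = "sk-12345";');
const redactedBearer = redactSecrets("Authorization: Bearer abc.def-123");
const redactedJwt = redactSecrets("eyJhbGciOi.eyJzdWIi.c2lnbmF0dXJl");
```

Rule order matters here: the JWT rule runs first so that a bearer token that is itself a JWT is masked once rather than partially matched by the later rules.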
@@ -37,7 +45,15 @@ $ npx toak
## Requirements
- npm/bun/yarn/pnpm
- Node.js (>=14.0.0)
- Git repository
- Bun runtime (for development)
## Installation
```bash
npm install toak
```
## Usage


@@ -61,7 +61,7 @@
"@types/micromatch": "^4.0.9",
"@types/node": "^22.14.0",
"bun": "latest",
"bun-plugin-isolated-decl": "^0.2.8",
"bun-plugin-isolated-decl": "^0.1.10",
"eslint": "^9.24.0",
"globals": "^15.15.0",
"oxc-transform": "^0.44.0",