The ePuSta-Server provides usage statistics of electronic publications. It expects enriched log files in the epustalogfile format (produced by ePuSta-logfileparser), imports them into a Solr core and serves the aggregated statistics through two separate HTTP APIs.
This project's scope is the Solr core and the HTTP APIs on top of it.
The shell helpers in bin/ are deliberately single-file helpers: each
one operates on a single log file, a single Solr import file, or a single source in the core.
Mass processing (iterating over directories, cron integration, bulk reimport) belongs in ePuSta_tools.
| Concern | Home project |
|---|---|
| Single-file parse / enrich / filter | ePuSta-logfileparser |
| Single-file import / single-source operations + HTTP APIs | this project |
| Mass / batch processing, cron, orchestration | ePuSta_tools |
- Linux
- Solr 7
- PHP 7.4+
- curl, bash

- Clone this repository:

      git clone https://github.com/gbv/ePuSta-Server.git

- Create a core in Solr and copy the files from the `solr/` directory into the core's `conf/` directory.

- Copy the config template and adjust the values:

      cp config/config.template config/config

  Relevant variables (consumed by the shell scripts in `bin/`):

  | Variable | Purpose |
  |---|---|
  | `solrUrl` | Base URL of the Solr server (e.g. `http://localhost:8983/solr/`) |
  | `solrCore` | Name of the Solr core |
  | `epustaLogs` | Directory containing the `*.epusta.log[.gz]` files |
  | `solrImports` | Directory where Solr import JSON files are written |
  | `epustaServerBin` | Path to this project's `bin/` directory |

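
A filled-in `config/config` might look like the sketch below. It assumes the file consists of plain shell `name=value` assignments that the scripts source; the core name `epusta` and all paths are placeholder values, and `core_url` is a hypothetical helper, not something shipped in `bin/`.

```shell
# Hypothetical example values; adjust them to your installation.
solrUrl="http://localhost:8983/solr/"
solrCore="epusta"
epustaLogs="/var/lib/epusta/logs"
solrImports="/var/lib/epusta/solrImports"
epustaServerBin="/opt/ePuSta-Server/bin"

# Hypothetical helper: join base URL and core name without doubling the slash.
core_url() {
    printf '%s/%s\n' "${solrUrl%/}" "$solrCore"
}
```

With these values, `core_url` prints `http://localhost:8983/solr/epusta`, which is the prefix for Solr handlers such as `/select` and `/update`.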
The bin/ directory contains everything needed to fill and maintain the Solr
core. Each script supports -h/--help.
epustalogfile (*.epusta.log[.gz])
│
│ createSolrImport_all.sh
▼
Solr import JSON (*.json in $solrImports)
│
│ import_all.sh
▼
Solr core
| Script | Purpose |
|---|---|
| `createSolrImport.php` | Transforms a single `*.epusta.log` file into a Solr import JSON file. |
| `createSolrImport_all.sh` | Runs `createSolrImport.php` for all epustalogfiles below `$epustaLogs`. Supports `.log` and `.log.gz`, skips files where the target is already up to date (`-f`/`--force` to overwrite). |
| `import.sh` | Legacy one-shot import of `solrImport.json` via `/opt/solr/bin/post`. |
| `import_all.sh` | Batch-imports all Solr import JSON files in `$solrImports`. Uses `listSourcesInCore.sh` to compare the document count per source in Solr against the line count in the file and only reimports when the count differs or the source is new (`-f`/`--force` to reimport unconditionally). |
| `import_allMissed.php` | Older helper that imports only files not yet present in the core (no count check). |
| `listSourcesInCore.sh` | Lists all source values currently in the Solr core with their document counts (output format selectable with `--format`, e.g. `text` or `json`). |
| `deleteSolrImportFromCore.sh` | Deletes all Solr documents whose source field matches a given Solr import JSON file. Used internally by `import_all.sh`. |
| `deleteSolrCore.sh` | Wipes the whole Solr core (`<delete><query>*:*</query></delete>`). |
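
The count comparison described for `import_all.sh` can be sketched as a small decision function. This is an illustration of the rule stated in the table, not the script's actual code; `needs_import` and its argument convention are hypothetical.

```shell
# Decide whether a Solr import file has to be (re)imported.
# $1: import file
# $2: document count reported for its source ("" if the source is not in the core)
# $3: "force" to reimport unconditionally (optional)
needs_import() {
    local file=$1 core_count=$2 force=${3:-}
    local file_count
    file_count=$(wc -l < "$file")
    file_count=$((file_count))          # strip padding some wc versions emit
    [ "$force" = "force" ] && return 0  # forced: always reimport
    [ -z "$core_count" ] && return 0    # new source: import
    [ "$core_count" -ne "$file_count" ] # known source: reimport on mismatch
}
```

A caller would run `needs_import "$file" "$count" && ...` and skip the file otherwise.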
Example: create the import file access.2019-12-01.json from a single log
file.
bin/createSolrImport.php --file=access-2019-12-01.epusta.log --level=PROD \
> access.2019-12-01.json
`--level`:

- `DEBUG`: transform all log lines
- `PROD`: transform only log lines with a publication identifier
Run the same for the complete configured log directory:
bin/createSolrImport_all.sh
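
The "skip if already up to date" behaviour and the `.log`/`.log.gz` handling can be pictured roughly as follows. This is a sketch under assumptions: the target naming rule (strip `.epusta.log[.gz]`, append `.json`), the modification-time comparison and all three function names are hypothetical and may differ from what `createSolrImport_all.sh` really does.

```shell
# Map an epustalogfile to an assumed import file name inside a target directory.
target_for() {
    local log=$1 outdir=$2 base
    base=$(basename "$log")
    base=${base%.gz}
    base=${base%.epusta.log}
    printf '%s/%s.json\n' "${outdir%/}" "$base"
}

# Succeeds when the target exists and the log file is not newer than it.
is_up_to_date() {
    local log=$1 target=$2
    [ -f "$target" ] && [ ! "$log" -nt "$target" ]
}

# Print a log file to stdout, decompressing .gz files on the fly.
read_log() {
    case $1 in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac
}
```

A batch loop would then call `is_up_to_date "$log" "$(target_for "$log" "$solrImports")"` and only run the transformation when that check fails or `--force` is given.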
Manual import of a single file:
/opt/solr/bin/post -c $solrCore access.2019-12-01.json
Full batch import of everything below $solrImports, only reimporting what is
missing or has a count mismatch:
bin/import_all.sh
bin/listSourcesInCore.sh # list sources + counts
bin/listSourcesInCore.sh --format json # same, JSON for scripting
bin/deleteSolrImportFromCore.sh file.json # delete one source
bin/deleteSolrCore.sh # wipe the whole core
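
For orientation, the kind of Solr requests behind these maintenance commands can be approximated with standard Solr features: a facet query on the `source` field and a delete-by-query update. The sketch below only builds the request strings; both function names are hypothetical, and the assumption that a source value equals the import file name is not confirmed by the scripts.

```shell
# Build a facet query URL that lists every value of the `source` field with its count.
# $1: Solr base URL, $2: core name
sources_facet_url() {
    printf '%s/%s/select?q=*:*&rows=0&facet=true&facet.field=source&facet.limit=-1&facet.mincount=1&wt=json\n' "${1%/}" "$2"
}

# Build the XML update body that deletes all documents of one source.
# $1: source value (assumed here to be the import file name)
delete_by_source_body() {
    printf '<delete><query>source:"%s"</query></delete>\n' "$1"
}

# Usage (needs a running Solr, therefore left commented out):
#   curl -s "$(sources_facet_url "$solrUrl" "$solrCore")"
#   curl -s -H 'Content-Type: text/xml' \
#        --data-binary "$(delete_by_source_body file.json)" \
#        "${solrUrl%/}/$solrCore/update?commit=true"
```

Replacing the `source:"..."` query with `*:*` gives the whole-core wipe that `deleteSolrCore.sh` is described as sending.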
The ePuSta-Server ships two separate HTTP APIs. Both read from the same Solr core but address different use cases and have different code bases and contracts. They can be deployed side by side.
The OAS API (oas-api/) is the original, lightweight endpoint that returns OpenAccess-Statistik (OAS) compatible reports. It is optimised for drop-in replacement of an OAS provider.
- Single entry point: `oas-api/index.php`
- Query-parameter driven: `do`, `from`, `until`, `granularity`, `content` (`counter`, `counter_abstract`, `robots`, `robots_abstract`), `identifier`, `summarized`, `addemptyrecords`, `jsonheader`, `informational`, `format`.
- Output: JSON (OAS report structure).
- Configuration: `oas-api/config.template.php` → `oas-api/config.php`.
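
Because the OAS API is driven entirely by query parameters, a request is just a URL. The sketch below assembles one from `key=value` pairs; `oas_url` is a hypothetical helper, the host and identifier are placeholders, and the date format is an assumption. Only the parameter names and the `content` value `counter` come from the list above; accepted values for the other parameters (such as `do`) are not documented here and are left out.

```shell
# Join a base URL and any number of key=value pairs into a query URL.
# Values are passed through unchanged, so they must already be URL-safe.
oas_url() {
    local url=$1 sep='?' pair
    shift
    for pair in "$@"; do
        url="$url$sep$pair"
        sep='&'
    done
    printf '%s\n' "$url"
}

# Hypothetical host and identifier; the date format is an assumption.
oas_url "https://stats.example.org/oas-api/index.php" \
    from=2019-12-01 until=2019-12-31 content=counter identifier=example-id
```

The resulting URL can be passed to `curl -s` to fetch the JSON report.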
The REST API (rest-api/) is a newer, OpenAPI-first API built on top of the Slim framework. It replaces the ad-hoc parameter interface of the OAS API with a documented, versioned contract and is the API that ePuSta-Elements targets.
- Entry point: `rest-api/index.php`
- OpenAPI description: `rest-api/Epusta-1.0.x.openapi.yaml`
- Interactive documentation through a mounted Swagger-UI.
- Configuration: `config/config.template` / `config/config.php` (shared with the rest of the project), plus `restApiDomain` and `restApiBasePath` for the public URL rendered into the OpenAPI document.
New integrations and the upcoming web frontend target this API.