105 commits
9272864
OSF to use latest djelme
bodintsov Mar 20, 2026
590f7a2
fix test fails
bodintsov Mar 23, 2026
6ebdd8e
fix poetry issue
bodintsov Mar 23, 2026
fbe2a08
add connection
bodintsov Mar 23, 2026
ea78a15
remove connection, add proper setUp and tearDown
bodintsov Mar 24, 2026
0efd0b1
remove elasticsearch and elasticsearch-dsl
bodintsov Mar 25, 2026
684a83f
remove elasticsearch-dsl
bodintsov Mar 25, 2026
1151610
remove sleep() and refresh indices
bodintsov Mar 25, 2026
4649800
remove unused imports, comment out
bodintsov Mar 25, 2026
5ea0ed1
chore: bump djelme dependency
aaxelb Mar 25, 2026
af0dfae
Merge pull request #11644 from bodintsov/feature/add-djelme
aaxelb Mar 26, 2026
674f963
wip: es8 djelme records (migration targets)
aaxelb Mar 25, 2026
2e73161
add new metrics
bodintsov Mar 31, 2026
4b4a478
fix flake8
bodintsov Apr 6, 2026
d3b48e4
add tests, use new version of djelme, consolidate into OsfCountedUsag…
bodintsov Apr 8, 2026
e4bec9d
add imports to init, flake8
bodintsov Apr 9, 2026
ee515ef
fix test, imports, flake8
bodintsov Apr 9, 2026
ca60b58
add security, flake8, fixes, add to test-build.yml
bodintsov Apr 10, 2026
080daf6
test-build update
bodintsov Apr 10, 2026
fde32a4
test-build fix url
bodintsov Apr 10, 2026
e6da70b
test-build fix naming
bodintsov Apr 10, 2026
2b8a81c
update test
bodintsov Apr 11, 2026
6167778
add wait
bodintsov Apr 13, 2026
eb0a5d9
remove wait
bodintsov Apr 13, 2026
78ed96f
cleanup
bodintsov Apr 14, 2026
70cf5e2
add wait, downgrade djelme, flake8
bodintsov Apr 14, 2026
3e35fee
add elastic8
bodintsov Apr 14, 2026
a236342
fix test
bodintsov Apr 14, 2026
00b055b
timedepth constants
aaxelb Apr 14, 2026
dddc94e
tidy gh actions with yaml anchors, health checks
aaxelb Apr 14, 2026
46a934f
simplify local elasticsearch8 config
aaxelb Apr 14, 2026
49f9259
bump djelme to get fixes
aaxelb Apr 14, 2026
29839b9
tests passing with djelme es8
aaxelb Apr 14, 2026
619cac7
fix(test): patch check_index_template
aaxelb Apr 14, 2026
8cec095
uncomment autouse fixture
aaxelb Apr 14, 2026
c24430f
remove unnecessary loop
aaxelb Apr 14, 2026
cd32827
plac8 flake8
aaxelb Apr 14, 2026
db938be
remove unused local env vars
aaxelb Apr 14, 2026
52a2bc9
better use waffle switch ELASTICSEARCH_METRICS
aaxelb Apr 14, 2026
82de65b
mock check mock save
aaxelb Apr 14, 2026
b33280d
remove the override
bodintsov Apr 15, 2026
1cef7d3
fix failing test
bodintsov Apr 15, 2026
6c45a66
Merge pull request #11672 from bodintsov/feature/add-new-es8-metrics
aaxelb Apr 15, 2026
029647f
add background_migration queue (in the osf way)
aaxelb Apr 9, 2026
ac397e8
wip
aaxelb Apr 14, 2026
ef981e7
wip
aaxelb Apr 15, 2026
9ed70f3
quieter elastic logs
aaxelb Apr 15, 2026
be1ed2f
wip
aaxelb Apr 15, 2026
64aeeab
wip
aaxelb Apr 15, 2026
97cd5b7
wip
aaxelb Apr 16, 2026
7eba5cc
wip
aaxelb Apr 16, 2026
7d554b6
wip
aaxelb Apr 17, 2026
68b38ba
wip
aaxelb Apr 17, 2026
69daa87
wip
aaxelb Apr 21, 2026
da7910a
wip
aaxelb Apr 21, 2026
95b42e6
wip
aaxelb Apr 21, 2026
bac21a0
chore: "fix' quotes
aaxelb Apr 21, 2026
999dc86
fix: background migration task module
aaxelb Apr 21, 2026
d9f5380
fix: timestamp tz handling
aaxelb Apr 21, 2026
beb8548
fix: tests with djelme
aaxelb Apr 21, 2026
778f4b4
fix: pageview_info optional
aaxelb Apr 21, 2026
ee91384
fix: tests
aaxelb Apr 21, 2026
a65d6a5
fix: preprint metric conversion
aaxelb Apr 21, 2026
2059a56
fix: osf_shell
aaxelb Apr 21, 2026
c186373
per-deployment djelme index name prefix
aaxelb Apr 22, 2026
b06f6eb
Merge pull request #11699 from aaxelb/9706-metrics-migration
aaxelb Apr 23, 2026
e161f5d
better counted-usage autofill (and item_type iris)
aaxelb Apr 23, 2026
45d1e30
osf-admin migrate_osfmetrics_6to8
aaxelb Apr 24, 2026
2537561
/_/metrics/raw-es8_metrics/...
aaxelb Apr 24, 2026
4084a36
better 6to8 error handling
aaxelb Apr 24, 2026
ce85704
fewer osfmetrics indexes
aaxelb Apr 24, 2026
c858e7b
add es8 reports
bodintsov Apr 22, 2026
59330e8
flake8
bodintsov Apr 23, 2026
3bbace3
modify tests
bodintsov Apr 24, 2026
bd2e7d4
flake8
bodintsov Apr 24, 2026
25bdadd
flake8
bodintsov Apr 24, 2026
f3729bc
fix to pass tests
bodintsov Apr 24, 2026
3481f14
flake8
bodintsov Apr 24, 2026
0d3164b
tests improve
bodintsov Apr 27, 2026
08258b7
flake8
bodintsov Apr 27, 2026
cb5797b
better match elasticsearch_metrics changes
aaxelb Apr 27, 2026
c4fa5f7
fix: counted-usage with session-hour
aaxelb Apr 27, 2026
dce545d
Merge pull request #11702 from bodintsov/feature/save-data-metrics-bo…
aaxelb Apr 27, 2026
b33593e
Merge branch 'develop' into feature/9691-osfmetrics-migration
aaxelb Apr 27, 2026
f88ed2f
fix(staging): elastic hostname with ip url
aaxelb Apr 27, 2026
4358756
fix: s/check_metrics/djelme_backend_check
aaxelb Apr 27, 2026
76138f1
fix: /_/metrics/raw- passthru
aaxelb Apr 28, 2026
d9e0076
fix: s/short_name/addon_shortname
aaxelb Apr 28, 2026
cfa5085
fix: make unused fields optional
aaxelb Apr 28, 2026
80d2f0d
fix: item_type list in osfmetrics 6to8
aaxelb Apr 28, 2026
aca7447
fix: es8 Field.serialize with skip_empty
aaxelb Apr 28, 2026
f0ffadd
fix: some optional osfmetrics report fields
aaxelb Apr 28, 2026
5b06f6a
fix: 0 session counts
aaxelb Apr 28, 2026
fd59272
avoid duplicate moderation events
aaxelb Apr 28, 2026
8ebd570
fix(6to8): more relevant usage event count
aaxelb Apr 28, 2026
9bcc951
fix: mirror reg mod event to es8
aaxelb Apr 29, 2026
3c24a6b
fix: autofill referent
aaxelb Apr 29, 2026
4be53dd
fix? double-counted usage
aaxelb Apr 29, 2026
88a939a
renames for consistency and clarity
aaxelb Apr 29, 2026
ea6ac1e
fix: broken import
aaxelb Apr 29, 2026
093066e
fix: idempotent event migration
aaxelb Apr 29, 2026
df9d118
allow clearing migration targets
aaxelb Apr 29, 2026
655d9dc
fix: skip gv addons in storage_addon_usage
aaxelb Apr 30, 2026
ceb9409
fix: es6 usage count to migrate
aaxelb Apr 30, 2026
aa4025c
fix: monthly institution reporter
aaxelb May 4, 2026
2 changes: 2 additions & 0 deletions .docker-compose.env
@@ -7,6 +7,8 @@ INTERNAL_DOMAIN=http://192.168.168.167:5000/
API_DOMAIN=http://localhost:8000/
ELASTIC_URI=192.168.168.167:9200
ELASTIC6_URI=192.168.168.167:9201
ELASTIC8_URI=http://192.168.168.167:9202
ELASTIC8_USERNAME=elastic
OSF_DB_HOST=192.168.168.167
DB_HOST=192.168.168.167
REDIS_HOST=redis://192.168.168.167:6379
103 changes: 27 additions & 76 deletions .github/workflows/test-build.yml
@@ -37,7 +37,19 @@ jobs:
permissions:
checks: write
services:
postgres:
elasticsearch8: &ES8_SERVICE
image: elasticsearch:8.19.14
ports:
- 9202:9200
env:
discovery.type: single-node
xpack.security.enabled: false
options: >-
--health-cmd "curl -sf http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s"
--health-interval 10s
--health-timeout 30s
--health-retries 5
postgres: &POSTGRES_SERVICE
image: postgres
env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
@@ -54,6 +66,8 @@ jobs:
- uses: ./.github/actions/start-build
- name: Run tests
run: poetry run python3 -m invoke test-ci-addons --junit
env:
ELASTIC8_URI: http://localhost:9202
- name: Upload report
if: (success() || failure()) # run this step even if previous step failed
uses: ./.github/actions/gen-report
@@ -64,18 +78,7 @@ jobs:
permissions:
checks: write
services:
postgres:
image: postgres
env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
# Maps tcp port 5432 on service container to the host
- 5432:5432
postgres: *POSTGRES_SERVICE
steps:
- uses: actions/checkout@v6
- uses: ./.github/actions/start-build
@@ -91,25 +94,17 @@ jobs:
permissions:
checks: write
services:
postgres:
image: postgres
env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
# Maps tcp port 5432 on service container to the host
- 5432:5432
elasticsearch8: *ES8_SERVICE
postgres: *POSTGRES_SERVICE
steps:
- uses: actions/checkout@v6
- uses: ./.github/actions/start-build
- name: NVM & yarn install
run: poetry run python3 -m invoke assets --dev
- name: Run test
run: poetry run python3 -m invoke test-ci-api1-and-js --junit
env:
ELASTIC8_URI: http://localhost:9202
- name: Upload report
if: (success() || failure()) # run this step even if previous step failed
uses: ./.github/actions/gen-report
@@ -120,23 +115,15 @@ jobs:
permissions:
checks: write
services:
postgres:
image: postgres
env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
# Maps tcp port 5432 on service container to the host
- 5432:5432
elasticsearch8: *ES8_SERVICE
postgres: *POSTGRES_SERVICE
steps:
- uses: actions/checkout@v6
- uses: ./.github/actions/start-build
- name: Run tests
run: poetry run python3 -m invoke test-ci-api2 --junit
env:
ELASTIC8_URI: http://localhost:9202
- name: Upload report
if: (success() || failure()) # run this step even if previous step failed
uses: ./.github/actions/gen-report
@@ -147,19 +134,7 @@ jobs:
checks: write
needs: build-cache
services:
postgres:
image: postgres

env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
# Maps tcp port 5432 on service container to the host
- 5432:5432
postgres: *POSTGRES_SERVICE
steps:
- uses: actions/checkout@v6
- uses: ./.github/actions/start-build
@@ -175,19 +150,7 @@ jobs:
checks: write
needs: build-cache
services:
postgres:
image: postgres

env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
# Maps tcp port 5432 on service container to the host
- 5432:5432
postgres: *POSTGRES_SERVICE
mailhog:
image: mailhog/mailhog
ports:
@@ -208,19 +171,7 @@ jobs:
checks: write
needs: build-cache
services:
postgres:
image: postgres

env:
POSTGRES_PASSWORD: ${{ env.OSF_DB_PASSWORD }}
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
# Maps tcp port 5432 on service container to the host
- 5432:5432
postgres: *POSTGRES_SERVICE
steps:
- uses: actions/checkout@v6
- uses: ./.github/actions/start-build
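The `elasticsearch8` service above is considered healthy once `_cluster/health` reports at least yellow status, retried up to five times. A minimal Python sketch of that readiness logic, assuming a caller-supplied `fetch_health` function (hypothetical, standing in for the HTTP request that `curl` makes in the workflow); it is an illustration of the idea, not what the Docker health check actually runs:

```python
import json
from typing import Callable


def cluster_is_ready(fetch_health: Callable[[], str], retries: int = 5) -> bool:
    """Return True as soon as one attempt reports a yellow or green cluster."""
    for _ in range(retries):
        try:
            status = json.loads(fetch_health()).get("status")
        except (OSError, ValueError):
            # connection refused or a non-JSON body: count it as a failed attempt
            status = None
        if status in ("yellow", "green"):
            return True
    return False


# usage with canned responses standing in for real HTTP calls
_responses = iter(['not json yet', '{"status": "red"}', '{"status": "yellow"}'])
ready = cluster_is_ready(lambda: next(_responses))
```

A real check would also sleep between attempts (the workflow uses a 10s interval); that is left out here to keep the sketch short.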
26 changes: 25 additions & 1 deletion addons/base/views.py
@@ -14,7 +14,7 @@
import waffle
from django.db import transaction
from django.contrib.contenttypes.models import ContentType
from elasticsearch import exceptions as es_exceptions
from elasticsearch6 import exceptions as es_exceptions
from rest_framework import status as http_status

from api.caching.tasks import update_storage_usage_with_size
@@ -34,6 +34,7 @@
from framework.flask import redirect
from framework.sentry import log_exception
from framework.transactions.handlers import no_auto_transaction
from osf.metrics.es8_metrics import OsfCountedUsageEvent
from website import settings
from addons.base import signals as file_signals
from addons.base.utils import format_last_known_metadata, get_mfr_url
@@ -691,6 +692,18 @@ def osfstoragefile_viewed_update_metrics(self, auth, fileversion, file_node):
version=fileversion.identifier,
path=file_node.path,
)
OsfCountedUsageEvent.record(
user_id=getattr(user, '_id', None),
item_osfid=resource._id,
action_labels=[
OsfCountedUsageEvent.ActionLabel.VIEW.value,
OsfCountedUsageEvent.ActionLabel.WEB.value,
],
# HACK: we don't have the user request, so fabricate a one-off session id
# (this means no double-click filtering for anonymous users (same as before)
# and potentially inflated "unique" sessionhour view counts)
client_session_id=str(uuid.uuid4()),
)
except es_exceptions.ConnectionError:
log_exception()

@@ -718,6 +731,17 @@ def osfstoragefile_downloaded_update_metrics(self, auth, fileversion, file_node)
version=fileversion.identifier,
path=file_node.path,
)
OsfCountedUsageEvent.record(
user_id=getattr(user, '_id', None),
item_osfid=resource._id,
action_labels=[
OsfCountedUsageEvent.ActionLabel.DOWNLOAD.value,
],
# HACK: we don't have the user request, so fabricate a one-off session id
# (this means no double-click filtering for anonymous users (same as before)
# and potentially inflated "unique" sessionhour download counts)
client_session_id=str(uuid.uuid4()),
)
except es_exceptions.ConnectionError:
log_exception()
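The HACK comments above note that a fabricated one-off session id can inflate "unique" session-hour counts. A minimal sketch of why, assuming uniqueness is judged on distinct (session id, hour) pairs; the `unique_session_hours` helper is hypothetical and is not how djelme or `OsfCountedUsageEvent` computes anything:

```python
import uuid
from datetime import datetime, timezone


def unique_session_hours(events):
    """Count distinct (session id, hour) pairs among (session_id, timestamp) events."""
    return len({
        (session_id, when.replace(minute=0, second=0, microsecond=0))
        for session_id, when in events
    })


first = datetime(2026, 4, 27, 12, 15, tzinfo=timezone.utc)
second = datetime(2026, 4, 27, 12, 45, tzinfo=timezone.utc)

# the same visitor twice within one hour, with a stable session id
stable = [("session-a", first), ("session-a", second)]
# the same two requests, each given a fresh uuid4 as in the views above
fabricated = [(str(uuid.uuid4()), first), (str(uuid.uuid4()), second)]

stable_count = unique_session_hours(stable)
fabricated_count = unique_session_hours(fabricated)
```

With a stable id the two requests collapse into one session-hour; with fabricated ids each request counts separately.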

5 changes: 3 additions & 2 deletions admin/management/urls.py
@@ -1,4 +1,4 @@
from django.urls import re_path
from django.urls import re_path, path

from admin.management import views

@@ -21,5 +21,6 @@
re_path(r'^sync_notification_templates', views.SyncNotificationTemplates.as_view(),
name='sync_notification_templates'),
re_path(r'^remove_orcid_from_user_social', views.RemoveOrcidFromUserSocial.as_view(),
name='remove_orcid_from_user_social')
name='remove_orcid_from_user_social'),
path('migrate_osfmetrics_6to8', views.MigrateOsfmetrics6to8.as_view(), name='migrate_osfmetrics_6to8'),
]
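The existing entries use `re_path` with patterns anchored only at the start, while the new entry uses `path` with a literal route, which Django matches against the whole remaining URL. A rough sketch of the difference, using plain `re` and string comparison as stand-ins (the helper names are hypothetical; this is not Django's routing code):

```python
import re

# re_path-style pattern from the list above: anchored at the start, no trailing `$`
prefix_pattern = re.compile(r'^remove_orcid_from_user_social')
# path()-style literal route for the new view
exact_route = 'migrate_osfmetrics_6to8'


def matches_prefix(url_path: str) -> bool:
    """True for any URL path that merely starts with the pattern."""
    return prefix_pattern.search(url_path) is not None


def matches_exact(url_path: str) -> bool:
    """True only when the URL path is exactly the literal route."""
    return url_path == exact_route


loose = matches_prefix('remove_orcid_from_user_social_extra')
strict = matches_exact('migrate_osfmetrics_6to8_extra')
```

Here `loose` is true and `strict` is false: the prefix-style pattern accepts trailing characters that the literal route rejects.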
23 changes: 23 additions & 0 deletions admin/management/views.py
@@ -1,9 +1,12 @@
from io import StringIO

from dateutil.parser import isoparse
from django.views.generic import TemplateView, View
from django.contrib import messages
from django.http import HttpResponse
from django.utils import timezone
from django.contrib.auth.mixins import PermissionRequiredMixin
from django.core.management import call_command

from osf.management.commands.manage_switch_flags import manage_waffle
from osf.management.commands.update_registration_schemas import update_registration_schemas
@@ -190,3 +193,23 @@ def post(self, request):
remove_orcid_from_user_social()
messages.success(request, 'Orcid from user social have been successfully removed.')
return redirect(reverse('management:commands'))


class MigrateOsfmetrics6to8(ManagementCommandPermissionView):
def post(self, request):
_command_kwargs = {
'no_setup': True,
'no_color': True,
'no_counts': request.POST.get('no_counts'),
'clear_state': request.POST.get('clear_state'),
'clear_es8_data': request.POST.get('clear_es8_data'),
'start': request.POST.get('start'),
'unchanged': request.POST.get('unchanged'),
'usage_reports': request.POST.get('usage_reports'),
'usage_events': request.POST.get('usage_events'),
}
_out_io = StringIO()
call_command('migrate_osfmetrics_6to8', **_command_kwargs, stdout=_out_io)
for _line in _out_io.getvalue().split('\n'):
messages.info(request, _line)
return redirect(reverse('management:commands'))
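`MigrateOsfmetrics6to8.post` passes the raw POST values straight through to the command. Browsers submit a checked checkbox that has no `value` attribute as `"on"` and omit unchecked ones entirely, so each option arrives as either `"on"` or `None`. A minimal sketch of normalizing those values to booleans, assuming a plain dict in place of `request.POST` (the `checkbox_flags` helper is hypothetical and not part of the PR):

```python
def checkbox_flags(post_data, names):
    """Map checkbox field names to booleans; unchecked boxes are absent from the POST data."""
    return {name: post_data.get(name) is not None for name in names}


# a form submission with only "start" and "usage_events" checked
submitted = {"start": "on", "usage_events": "on"}
flags = checkbox_flags(
    submitted,
    ["no_counts", "start", "clear_state", "clear_es8_data", "usage_events"],
)
```

Whether the management command needs strict booleans or is content with truthy and falsy values is not shown in this diff.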
25 changes: 25 additions & 0 deletions admin/templates/management/commands.html
@@ -178,6 +178,31 @@ <h4><u>Remove existing orcid info from user social</u></h4>
</nav>
</form>
</section>
<section>
<h4><u>migrate osf-metrics 6to8</u></h4>
<p>
view progress of the osf-metrics migration from elastic6 to elastic8 (or start it)
</p>
<form method="post"
action="{% url 'management:migrate_osfmetrics_6to8'%}"
style="display: flex; flex-direction: column;">
{% csrf_token %}
<label><input type="checkbox" name="no_counts"> no counts</label>
<label><input type="checkbox" name="start"> start tasks (caution)</label>
<label><input type="checkbox" name="clear_state"> reset migration start time (caution)</label>
<label><input type="checkbox" name="clear_es8_data"> clear es8 data (big caution)</label>
<fieldset>
(narrow types:
<label><input type="checkbox" name="unchanged"> unchanged events and reports</label>
<label><input type="checkbox" name="usage_events"> usage events</label>
<label><input type="checkbox" name="usage_reports"> usage reports</label>
)
</fieldset>
<nav>
<input class="btn btn-success" type="submit" value="Run" />
</nav>
</form>
</section>
</div>
</section>
{% endblock %}
8 changes: 4 additions & 4 deletions api/base/elasticsearch_dsl_views.py
@@ -3,7 +3,7 @@
import datetime
import typing

import elasticsearch_dsl as edsl
import elasticsearch6_dsl as edsl
from rest_framework import generics, exceptions as drf_exceptions
from rest_framework.settings import api_settings as drf_settings
from api.base.settings.defaults import REPORT_FILENAME_FORMAT
@@ -23,7 +23,7 @@


class ElasticsearchListView(FilterMixin, JSONAPIBaseView, generics.ListAPIView, abc.ABC):
'''abstract view class using `elasticsearch_dsl.Search` as a queryset-analogue
'''abstract view class using `elasticsearch6_dsl.Search` as a queryset-analogue

builds a `Search` based on `self.get_default_search()` and the request's
query parameters for filtering, sorting, and pagination -- fetches only
@@ -36,7 +36,7 @@ class ElasticsearchListView(FilterMixin, JSONAPIBaseView, generics.ListAPIView,

@abc.abstractmethod
def get_default_search(self) -> edsl.Search | None:
'''the base `elasticsearch_dsl.Search` for this list, based on url path
'''the base `elasticsearch6_dsl.Search` for this list, based on url path

(common jsonapi query parameters will be considered automatically)
'''
@@ -95,7 +95,7 @@ def finalize_response(self, request, response, *args, **kwargs):
# (filtering handled in-view to reuse logic from FilterMixin)
filter_backends = ()

# note: because elasticsearch_dsl.Search supports slicing and gives results when iterated on,
# note: because elasticsearch6_dsl.Search supports slicing and gives results when iterated on,
# it works fine with default pagination

# override rest_framework.generics.GenericAPIView
25 changes: 21 additions & 4 deletions api/base/settings/defaults.py
@@ -320,10 +320,27 @@
HASHIDS_SALT = 'pinkhimalayan'

# django-elasticsearch-metrics
ELASTICSEARCH_DSL = {
'default': {
'hosts': osf_settings.ELASTIC6_URI,
'retry_on_timeout': True,
DJELME_BACKENDS = {
'osfmetrics_es6': {
'elasticsearch_metrics.imps.elastic6': {
'hosts': osf_settings.ELASTIC6_URI,
'retry_on_timeout': True,
},
},
'osfmetrics_es8': {
'elasticsearch_metrics.imps.elastic8': {
# passthru kwargs to elasticsearch8 connection constructor
'hosts': osf_settings.ELASTIC8_URI,
'ca_certs': osf_settings.ELASTIC8_CERT_PATH,
'basic_auth': (
(osf_settings.ELASTIC8_USERNAME, osf_settings.ELASTIC8_SECRET)
if osf_settings.ELASTIC8_SECRET is not None
else None
),
'ssl_assert_hostname': osf_settings.ELASTIC8_ASSERT_HOSTNAME,
# djelme-specific kwargs
'djelme_default_index_name_prefix': osf_settings.SHARE_PROVIDER_PREPEND,
},
},
}
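The `osfmetrics_es8` backend above only sets `basic_auth` when a secret is configured, so a local cluster with security disabled can be used without credentials. A minimal sketch of that conditional, assuming the values come from environment variables; `ELASTIC8_URI` and `ELASTIC8_USERNAME` appear in `.docker-compose.env`, while reading `ELASTIC8_SECRET` from the environment and the fallback defaults are assumptions made for illustration:

```python
import os


def es8_connection_kwargs(environ=os.environ):
    """Build connection kwargs, adding basic auth only when a secret is present."""
    secret = environ.get("ELASTIC8_SECRET")  # assumed variable name
    username = environ.get("ELASTIC8_USERNAME", "elastic")
    return {
        "hosts": environ.get("ELASTIC8_URI", "http://localhost:9202"),
        "basic_auth": (username, secret) if secret is not None else None,
    }


local_kwargs = es8_connection_kwargs({"ELASTIC8_URI": "http://192.168.168.167:9202"})
secured_kwargs = es8_connection_kwargs({"ELASTIC8_SECRET": "example-secret"})
```

Passing a dict instead of `os.environ` keeps the function easy to exercise without touching the real environment.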
# Store yearly indices for time-series metrics