
Commit 0193a73

Merge pull request #544 from dice-group/general-fixes
General fixes
2 parents 0433309 + cf12118 commit 0193a73

10 files changed: 123 additions, 52 deletions


README.md

Lines changed: 14 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -18,7 +18,8 @@ $E^+$ and $E^-$, learning [OWL Class expression](https://www.w3.org/TR/owl2-synt
1818

1919
$$\forall p \in E^+\ \mathcal{K} \models H(p) \wedge \forall n \in E^-\ \mathcal{K} \not \models H(n).$$
2020

21-
To tackle this supervised learning problem, ontolearn offers many symbolic, neuro-symbolic and deep learning based Learning algorithms:
21+
To address this supervised learning problem, OntoLearn provides a diverse suite of learning algorithms,
22+
including symbolic, neuro-symbolic, and deep learning-based approaches:
2223
- **TDL** → [Tree-based OWL Class Expression Learner for Large Graphs](https://rdcu.be/eJDlY)
2324
- **Drill** → [Neuro-Symbolic Class Expression Learning](https://www.ijcai.org/proceedings/2023/0403.pdf)
2425
- **EvoLearner** → [EvoLearner: Learning Description Logics with Evolutionary Algorithms](https://dl.acm.org/doi/abs/10.1145/3485447.3511925)
@@ -348,13 +349,22 @@ pytest -p no:warnings -x # Running 76 tests takes ~ 17 mins
348349
```
349350

350351

351-
352352
</details>
353353

354354
## References
355-
Currently, we are working on our manuscript describing our framework.
356355
If you find our work useful in your research, please consider citing the respective paper:
357356
```
357+
# Ontolearn
358+
@article{demir2025ontolearn,
359+
title={Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python},
360+
author={Demir, Caglar and Baci, Alkid and Kouagou, N'Dah Jean and Sieger, Leonie Nora and Heindorf, Stefan and Bin, Simon and Bl{\"u}baum, Lukas and Bigerl, Alexander and Ngomo, Axel-Cyrille Ngonga},
361+
journal={Journal of Machine Learning Research},
362+
volume={26},
363+
number={63},
364+
pages={1--6},
365+
year={2025}
366+
}
367+
358368
# TDL
359369
@InProceedings{10.1007/978-3-032-06066-2_29,
360370
author={Demir, Caglar and Yekini, Moshood and R{\"o}der, Michael and Mahmood, Yasir and Ngonga Ngomo, Axel-Cyrille},
@@ -368,17 +378,6 @@ pages={495--511},
368378
isbn={978-3-032-06066-2}
369379
}
370380
371-
# Ontolearn
372-
@article{demir2025ontolearn,
373-
title={Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python},
374-
author={Demir, Caglar and Baci, Alkid and Kouagou, N'Dah Jean and Sieger, Leonie Nora and Heindorf, Stefan and Bin, Simon and Bl{\"u}baum, Lukas and Bigerl, Alexander and Ngomo, Axel-Cyrille Ngonga},
375-
journal={Journal of Machine Learning Research},
376-
volume={26},
377-
number={63},
378-
pages={1--6},
379-
year={2025}
380-
}
381-
382381
# ROCES
383382
@inproceedings{kouagou2024roces,
384383
title = {ROCES: Robust Class Expression Synthesis in Description Logics via Iterative Sampling},
@@ -446,4 +445,4 @@ address="Cham"
446445
}
447446
```
448447

449-
In case you have any question, please contact: ```[email protected]``` or ```[email protected]```
448+
In case you have any question or feedback, please contact us: ```[email protected]``` or ```[email protected]```.

docs/usage/01_introduction.md

Lines changed: 13 additions & 13 deletions
@@ -20,25 +20,25 @@ One of OntoLearn’s key contributions is its exclusive concept learning algorit
2020
Logics (DL). The library currently includes nine fully functional algorithms capable of learning complex concepts in DL.
2121
For further details and references, relevant research papers can be found [here](09_further_resources.md).
2222

23-
At the core of OntoLearn lies [Owlapy](https://github.com/dice-group/owlapy), a Python package inspired by the OWL API (its Java counterpart) and developed by
23+
At the core of OntoLearn lies [OWLAPY](https://github.com/dice-group/owlapy), a Python package inspired by the OWL API (its Java counterpart) and developed by
2424
the OntoLearn team. To enhance modularity, readability, and maintainability, we have separated Owlapy from Ontolearn into an
2525
independent repository. This modular approach allows Owlapy to serve not only as a framework for representing OWL 2
2626
entities, but also as a tool for ontology manipulation and reasoning.
2727

2828
---------------------------------------
2929

30-
**Ontolearn (including [owlapy](https://github.com/dice-group/owlapy) and [ontosample](https://github.com/alkidbaci/OntoSample)) can do the following:**
31-
32-
- **Use concept learning algorithms to generate hypotheses for classifying positive examples in a learning problem**.
33-
- **Use local datasets or datasets that are hosted on a triplestore server, for the learning task.**
34-
- Construct/Generate class expressions and evaluate them using different metrics.
35-
- Define learning problems.
36-
- Load/create/save ontologies in RDF/XML, OWL/XML.
37-
- Modify ontologies by adding/removing axioms.
38-
- Access individuals/classes/properties of an ontology (and a lot more).
39-
- Reason over an ontology.
40-
- Convenient functionalities like converting OWL class expressions to SPARQL or DL syntax.
41-
- Sample ontologies.
30+
**Ontolearn offers:**
31+
32+
- **Diverse concept learning algorithms to generate hypotheses for classifying positive examples in a learning problem**.
33+
- **Support for local datasets and datasets hosted on triplestore servers.**
34+
- Perform operation on generated class expressions like evaluating them using different metrics, verbalizing, etc.
35+
- Generate learning problems.
36+
- An extensible and modular architecture that allows easy integration of new algorithms and functionalities.
37+
38+
Via [OWLAPY](https://github.com/dice-group/owlapy) you can also perform ontology manipulations and reasoning.
39+
40+
[OntoSample](https://github.com/alkidbaci/OntoSample) is another library closely related with the task of concept learning, where sampling is used
41+
to accelerate the learning process.
4242

4343
------------------------------------
4444

docs/usage/03_examples.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ for str_target_concept, examples in settings['problems'].items():
7474
model.save_best_hypothesis(n=3, path='Predictions_{0}'.format(str_target_concept))
7575

7676
# Get top n hypotheses
77-
hypotheses = list(model.best_hypotheses(n=3))
77+
hypotheses = list(model.best_hypotheses(n=3, return_node=True))
7878

7979
# Use hypotheses as binary function to label individuals.
8080
predictions = model.predict(individuals=list(typed_pos | typed_neg),

docs/usage/09_further_resources.md

Lines changed: 26 additions & 4 deletions
@@ -24,11 +24,33 @@ Also check Owlapy's documentation [here](https://dice-group.github.io/owlapy/usa
2424

2525

2626
## Citing
27-
28-
Currently, we are working on our manuscript describing our framework.
2927
If you find our work useful in your research, please consider citing the respective paper:
3028

3129
```
30+
# Ontolearn
31+
@article{demir2025ontolearn,
32+
title={Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python},
33+
author={Demir, Caglar and Baci, Alkid and Kouagou, N'Dah Jean and Sieger, Leonie Nora and Heindorf, Stefan and Bin, Simon and Bl{\"u}baum, Lukas and Bigerl, Alexander and Ngomo, Axel-Cyrille Ngonga},
34+
journal={Journal of Machine Learning Research},
35+
volume={26},
36+
number={63},
37+
pages={1--6},
38+
year={2025}
39+
}
40+
41+
# TDL
42+
@InProceedings{10.1007/978-3-032-06066-2_29,
43+
author={Demir, Caglar and Yekini, Moshood and R{\"o}der, Michael and Mahmood, Yasir and Ngonga Ngomo, Axel-Cyrille},
44+
editor={Ribeiro, Rita P. and Pfahringer, Bernhard and Japkowicz, Nathalie and Larra{\~{n}}aga, Pedro and Jorge, Al{\'i}pio M. and Soares, Carlos and Abreu, Pedro H. and Gama, Jo{\~a}o},
45+
title={Tree-Based OWL Class Expression Learner over Large Graphs},
46+
booktitle={Machine Learning and Knowledge Discovery in Databases. Research Track},
47+
year={2026},
48+
publisher={Springer Nature Switzerland},
49+
address={Cham},
50+
pages={495--511},
51+
isbn={978-3-032-06066-2}
52+
}
53+
3254
# DRILL
3355
@inproceedings{demir2023drill,
3456
author = {Demir, Caglar and Ngomo, Axel-Cyrille Ngonga},
@@ -123,10 +145,10 @@ the project better. Find them in the folders
123145

124146
## Contribution
125147

126-
We try to keep documentation up to day to the latest changes, but sometimes we may
148+
We try to keep documentation up to date with the latest changes, but sometimes we may
127149
overlook some details or make mistakes. If you notice any of such things please let us know :).
128150
As for coding part, feel free to create a pull request and our developers will take a look
129-
on it. We appreciate your commitment.
151+
at it. We appreciate your commitment.
130152

131153
## Questions
132154

examples/concept_learning_with_evolearner.py

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@
6060

6161
model.save_best_hypothesis(n=3, path='Predictions_{0}'.format(str_target_concept))
6262
# Get Top n hypotheses
63-
hypotheses = list(model.best_hypotheses(n=3))
63+
hypotheses = list(model.best_hypotheses(n=3, return_node=True))
6464
# Use hypotheses as binary function to label individuals.
6565
predictions = model.predict(individuals=list(typed_pos | typed_neg),
6666
hypotheses=hypotheses)

examples/concept_learning_with_tdl_and_triplestore_kb.py

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
66
from ontolearn.utils.static_funcs import save_owl_class_expressions
77
from owlapy.render import DLSyntaxObjectRenderer
88
# (1) Initialize Triplestore- Make sure that UPB VPN is on
9-
kb = TripleStore(url="http://dice-dbpedia.cs.upb.de:9080/sparql")
9+
kb = TripleStore(url="https://dbpedia.data.dice-research.org/sparql")
1010
# (2) Initialize a DL renderer.
1111
render = DLSyntaxObjectRenderer()
1212
# (3) Initialize a learner.
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
import json
2+
3+
from owlapy.owl_individual import OWLNamedIndividual, IRI
4+
5+
from ontolearn.knowledge_base import KnowledgeBase
6+
from ontolearn.learners import TDL
7+
from ontolearn.learning_problem import PosNegLPStandard
8+
from owlapy.render import DLSyntaxObjectRenderer
9+
10+
kb = KnowledgeBase(path="../KGs/Mutagenesis/mutagenesis.owl")
11+
render = DLSyntaxObjectRenderer()
12+
model = TDL(kb)
13+
14+
15+
with open('../LPs/Mutagenesis/lps.json') as json_file:
16+
    settings = json.load(json_file)
17+
p = set(settings['problems']['NotKnown']['positive_examples'])
18+
n = set(settings['problems']['NotKnown']['negative_examples'])
19+
typed_pos = set(map(OWLNamedIndividual, map(IRI.create, p)))
20+
typed_neg = set(map(OWLNamedIndividual, map(IRI.create, n)))
21+
lp = PosNegLPStandard(pos=typed_pos, neg=typed_neg)
22+
23+
h = model.fit(learning_problem=lp).best_hypotheses()
24+
str_concept = render.render(h)
25+
print("Concept:", str_concept)
26+

ontolearn/lp_generator/generate_data.py

Lines changed: 17 additions & 12 deletions
@@ -29,19 +29,24 @@
2929

3030

3131
class LPGen:
32-
def __init__(self, kb_path=None, kb=None, storage_path=None, max_num_lps=1000, beyond_alc=False, depth=3, max_child_length=20, refinement_expressivity=0.2,
33-
downsample_refinements=True, sample_fillers_count=10, num_sub_roots=50, min_num_pos_examples=1):
32+
def __init__(self, kb_path=None, kb=None, storage_path=None, max_num_lps=1000, beyond_alc=False, depth=3,
33+
max_child_length=20, refinement_expressivity=0.2, downsample_refinements=True, sample_fillers_count=10,
34+
num_sub_roots=50, min_num_pos_examples=1, max_pos_neg_examples_per_lp=None):
3435
"""
35-
Args
36-
- kb_path: path to the owl file representing the knowledge base/ontology
37-
- storage_path: directory in which to store the data to be generated. Not the directory needs not to exists, it would be created automatically
38-
- max_num_lps: the maximum number of learning problems to store
39-
- beyond_alc: whether to generate learning problems in ALCHIQD
40-
- depth, max_child_length, refinement_expressivity, sample_fillers_count, num_sub_roots all refer to the size of the data (learning problems) to be generated
41-
- downsample_refinements: whether to downsample refinements in ExpressRefinement. If refinement_expressivity<1, this must be set to True
36+
Args:
37+
kb_path: path to the owl file representing the knowledge base/ontology
38+
kb: an instance of KnowledgeBase class. Can be used instead of kb_path.
39+
storage_path: directory in which to store the data to be generated. Not the directory needs not to exists, it would be created automatically
40+
max_num_lps: the maximum number of learning problems to store
41+
beyond_alc: whether to generate learning problems in ALCHIQD
42+
depth, max_child_length, refinement_expressivity, sample_fillers_count, num_sub_roots all refer to the size of the data (learning problems) to be generated
43+
downsample_refinements: whether to downsample refinements in ExpressRefinement. If refinement_expressivity<1, this must be set to True
4244
"""
43-
self.lp_gen = KB2Data(path=kb_path,knowledge_base=kb, storage_path=storage_path, max_num_lps=max_num_lps, beyond_alc=beyond_alc, depth=depth,
44-
max_child_length=max_child_length, refinement_expressivity=refinement_expressivity,
45-
downsample_refinements=downsample_refinements, sample_fillers_count=sample_fillers_count, num_sub_roots=num_sub_roots, min_num_pos_examples=min_num_pos_examples)
45+
self.lp_gen = KB2Data(path=kb_path,knowledge_base=kb, storage_path=storage_path, max_num_lps=max_num_lps,
46+
beyond_alc=beyond_alc, depth=depth, max_child_length=max_child_length,
47+
refinement_expressivity=refinement_expressivity,
48+
downsample_refinements=downsample_refinements, sample_fillers_count=sample_fillers_count,
49+
num_sub_roots=num_sub_roots, min_num_pos_examples=min_num_pos_examples,
50+
max_pos_neg_examples_per_lp = max_pos_neg_examples_per_lp)
4651
def generate(self):
4752
self.lp_gen.generate_descriptions().save_data()

ontolearn/lp_generator/helper_classes.py

Lines changed: 22 additions & 4 deletions
@@ -21,7 +21,7 @@
2121
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
2222
# SOFTWARE.
2323
# -----------------------------------------------------------------------------
24-
24+
from ontolearn.triple_store import TripleStore
2525
from tqdm import tqdm
2626
import random
2727
from ontolearn.knowledge_base import KnowledgeBase
@@ -72,7 +72,7 @@ class KB2Data:
7272
def __init__(self, path=None, storage_path=None, max_num_lps=1000, beyond_alc=False, depth=3,
7373
max_child_length=20, refinement_expressivity=0.2,
7474
downsample_refinements=True, sample_fillers_count=10, num_sub_roots=50,
75-
min_num_pos_examples=1,knowledge_base=None):
75+
min_num_pos_examples=1,knowledge_base=None, max_pos_neg_examples_per_lp=None):
7676
"""
7777
Args
7878
- kb_path: path to the owl file representing the knowledge base/ontology
@@ -83,8 +83,19 @@ def __init__(self, path=None, storage_path=None, max_num_lps=1000, beyond_alc=Fa
8383
- depth, refinement_expressivity, sample_fillers_count, num_sub_roots all refer to the size of the data (learning problems) to be generated
8484
- downsample_refinements: whether to downsample refinements in ExpressRefinement. If refinement_expressivity<1, this must be set to True
8585
"""
86-
        self.path = path
86+
        if path and knowledge_base:
87+
            assert path == knowledge_base.path, "Path argument and knowledge base's path do not match"
88+
        if path:
89+
            self.path = path
90+
        elif isinstance(knowledge_base, KnowledgeBase):
91+
            self.path = knowledge_base.path
92+
        elif isinstance(knowledge_base, TripleStore):
93+
            self.path = None
94+
        else:
95+
            raise ValueError("No knowledge base provided or path provided. Please provide a value to 'path' or "
96+
                             "'knowledge_base' arguments")
8797
        if storage_path is None:
98+
            assert self.path is not None, "Storage path must be provided. Please provide a value to 'storage_path'"
8899
            self.storage_path = f'{self.path[:self.path.rfind("/")]}/LPs/'
89100
        else:
90101
            self.storage_path = storage_path
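
The path-resolution rules introduced in this hunk can be tried out in isolation. The sketch below is a stand-alone illustration under stated assumptions, not the library's code: `FakeKnowledgeBase` and `FakeTripleStore` are hypothetical stand-ins for `KnowledgeBase` and `TripleStore`, and `resolve_paths` is a hypothetical helper that mirrors the branching shown above.

```python
class FakeKnowledgeBase:
    """Hypothetical stand-in for a file-backed knowledge base."""

    def __init__(self, path):
        self.path = path


class FakeTripleStore:
    """Hypothetical stand-in for a remote triplestore (no local file path)."""

    def __init__(self, url):
        self.url = url


def resolve_paths(path=None, knowledge_base=None, storage_path=None):
    """Mirror the hunk's branching and return (path, storage_path)."""
    # An explicit path must agree with the knowledge base's own path.
    if path and knowledge_base:
        assert path == knowledge_base.path, "Path argument and knowledge base's path do not match"
    if path:
        resolved = path
    elif isinstance(knowledge_base, FakeKnowledgeBase):
        resolved = knowledge_base.path
    elif isinstance(knowledge_base, FakeTripleStore):
        resolved = None  # a remote endpoint has no local file
    else:
        raise ValueError("Provide a value for 'path' or 'knowledge_base'")
    if storage_path is None:
        # Without a local file there is no directory to derive a default from.
        assert resolved is not None, "Storage path must be provided"
        storage_path = f'{resolved[:resolved.rfind("/")]}/LPs/'
    return resolved, storage_path


print(resolve_paths(path="KGs/Family/family.owl"))
# ('KGs/Family/family.owl', 'KGs/Family/LPs/')
print(resolve_paths(knowledge_base=FakeTripleStore("http://example.org/sparql"),
                    storage_path="out/"))
# (None, 'out/')
```

One detail of the default worth keeping in mind: for a path that contains no `/`, `rfind` returns `-1`, so the slice drops the last character of the file name instead of yielding a parent directory. Passing `storage_path` explicitly avoids this.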
@@ -96,7 +107,14 @@ def __init__(self, path=None, storage_path=None, max_num_lps=1000, beyond_alc=Fa
96107
            self.kb = KnowledgeBase(path=path)
97108
        else:
98109
            self.kb = self.knowledge_base
99-
        self.num_examples = self.find_optimal_number_of_examples()
110+
        if max_pos_neg_examples_per_lp:
111+
            count = self.kb.individuals_count()
112+
            assert max_pos_neg_examples_per_lp <= count, \
113+
                ("The maximum number of examples per learning problem cannot be greater "
114+
                 "than the number of individuals in the knowledge base")
115+
            self.num_examples = max_pos_neg_examples_per_lp
116+
        else:
117+
            self.num_examples = self.find_optimal_number_of_examples()
100118
        self.min_num_pos_examples = min_num_pos_examples
101119
        atomic_concepts = frozenset(self.kb.ontology.classes_in_signature())
102120
        self.atomic_concept_names = frozenset(
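
The new `max_pos_neg_examples_per_lp` branch in the hunk above boils down to a small selection rule. The sketch below reproduces that rule outside the class, as an illustration only: `choose_num_examples` is a hypothetical helper, `individuals_count` stands in for `self.kb.individuals_count()`, and `find_optimal` is a placeholder callable for `find_optimal_number_of_examples()`, whose actual behaviour is not shown in this diff.

```python
def choose_num_examples(individuals_count, max_pos_neg_examples_per_lp=None,
                        find_optimal=lambda: 1000):
    """Return the number of examples per learning problem.

    `find_optimal` is a placeholder for the library's own heuristic; the
    default value of 1000 is arbitrary and used only for illustration.
    """
    if max_pos_neg_examples_per_lp:
        # The cap cannot exceed the number of individuals available.
        assert max_pos_neg_examples_per_lp <= individuals_count, (
            "The maximum number of examples per learning problem cannot be greater "
            "than the number of individuals in the knowledge base")
        return max_pos_neg_examples_per_lp
    # None (and any other falsy value such as 0) falls back to the heuristic.
    return find_optimal()


print(choose_num_examples(100, 20))                # 20
print(choose_num_examples(100, None, lambda: 42))  # 42
```

Because the check is a truthiness test, passing `0` behaves like passing `None`: the heuristic is used rather than a cap of zero.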

ontolearn/triple_store.py

Lines changed: 2 additions & 1 deletion
@@ -43,7 +43,8 @@
4343
)
4444
from owlapy.owl_datatype import OWLDatatype
4545
from owlapy.owl_individual import OWLNamedIndividual
46-
from owlapy.owl_literal import OWLLiteral, BooleanOWLDatatype, DoubleOWLDatatype, NUMERIC_DATATYPES, TIME_DATATYPES
46+
from owlapy.owl_literal import OWLLiteral, BooleanOWLDatatype, DoubleOWLDatatype, NUMERIC_DATATYPES, TIME_DATATYPES, \
47+
NonNegativeIntegerOWLDatatype
4748
from owlapy.owl_ontology import OWLOntologyID
4849
from owlapy.abstracts import AbstractOWLOntology, AbstractOWLReasoner
4950
from owlapy.owl_property import (
