
C++ - Cycles in Family Tree Software - Stack Overflow

The software should relax data assertions by changing errors to warnings and allowing the user to add relationships anyway. This resolves issues without removing assertions while still alerting users to potential mistakes.

Cycles in family tree software

Asked 12 years, 10 months ago Modified 8 years, 8 months ago Viewed 290k times

1590 votes

Locked. This question and its answers are locked because the question is off-topic but has historical
significance. It is not currently accepting new answers or interactions.

I am the developer of some family tree software (written in C++ and Qt). I had no problems until one of
my customers mailed me a bug report. The problem is that the customer has two children with their
own daughter, and, as a result, he can't use my software because of errors.

Those errors are the result of my various assertions and invariants about the family graph being
processed (for example, after walking a cycle, the program states that X can't be both father and
grandfather of Y).

How can I resolve those errors without removing all data assertions?

c++ graph cycle assertions family-tree

Share edited Aug 13, 2015 at 23:13 asked May 28, 2011 at 18:39
Nathaniel Ford Partick Höse
20.9k 20 94 104 11.1k 3 15 9

220 You should obviously write your software with Ray Stevens' song in mind. – Peter K. May 28, 2011 at 19:00

30 If you trace your family tree backwards far enough, you will hit this problem far more often than you would like.
Abandoning the tree representation may be painful but would ultimately be more correct. – Thomas Jun 1, 2011
at 4:56

55 You shouldn't add assertions for unlikely things, only impossible things. Cycles are the obvious things that aren't
possible in a family tree graph... no one can be his own ancestor via any method. These other assertions are
just bogus and should be removed. – pgod Jun 1, 2011 at 5:08

44 This is not at all a silly question in the world of pet breeding. Daughter to father, mother to son, sister to brother,
grandchildren to grandparents is standard technique there, and pet breeders need family tree software too.
"Pure-bred" my ¤%#&. – kaleissin Jun 1, 2011 at 9:54

31 Marrying first cousins was very common in Victorian England, especially among the upper classes (it was an
excellent way to keep money within the family). Charles Darwin, for instance, married his first cousin, Emma
Wedgwood. Any family tree software needs to support situations like this. – rtperson Jun 1, 2011 at 15:19


18 Answers Sorted by: Highest score (default)

726 votes

It seems you (and/or your company) have a fundamental misunderstanding of what a family tree is
supposed to be.

Let me clarify, I also work for a company that has (as one of its products) a family tree in its portfolio, and
we have been struggling with similar problems.
The problem, in our case, and I assume your case as well, comes from the GEDCOM format that is
extremely opinionated about what a family should be. However this format contains some severe
misconceptions about what a family tree really looks like.

GEDCOM has many issues, such as incompatibility with same-sex relations, incest, etc., which in real
life happen more often than you'd imagine (especially when going back in time to the 1700s and 1800s).

We have modeled our family tree to what happens in the real world: Events (for example, births,
weddings, engagement, unions, deaths, adoptions, etc.). We do not put any restrictions on these, except
for logically impossible ones (for example, one can't be one's own parent, relations need two individuals,
etc...)

The lack of validations gives us a more "real world", simpler and more flexible solution.

As for this specific case, I would suggest removing the assertions as they do not hold universally.

For displaying issues (that will arise) I would suggest drawing the same node as many times as needed,
hinting at the duplication by lighting up all the copies on selecting one of them.
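
A minimal sketch of the event-based idea described above, assuming a C++17 code base. The names `Event`, `EventKind`, `PersonId` and `validate` are hypothetical and do not come from any real product; the only checks are the two "logically impossible" examples the answer mentions.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
using PersonId = int;

enum class EventKind { Birth, Wedding, Engagement, Union, Death, Adoption };

struct Event {
    EventKind kind;
    std::vector<PersonId> participants;  // parents of a birth, partners of a wedding, ...
    std::optional<PersonId> child;       // set for Birth and Adoption
};

// Returns a message only for logically impossible events and
// std::nullopt for everything else, however unusual it may be.
std::optional<std::string> validate(const Event& e) {
    const bool parental = e.kind == EventKind::Birth || e.kind == EventKind::Adoption;
    if (parental && e.child) {
        for (PersonId p : e.participants) {
            if (p == *e.child) return "a person cannot be their own parent";
        }
    }
    const bool relation = e.kind == EventKind::Wedding ||
                          e.kind == EventKind::Engagement ||
                          e.kind == EventKind::Union;
    if (relation && e.participants.size() < 2) {
        return "a relation needs two individuals";
    }
    return std::nullopt;
}
```

With this shape, a birth whose father is also the child's grandfather passes validation, because nothing in the event itself is impossible.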

Share edited Mar 6, 2014 at 14:47 answered Jun 1, 2011 at 8:25


Bert Goethals
7,877 2 21 32

32 This looks like the right approach, and it's easy enough to extend to detect more complex problems. You can work
out a set of "A happened before B" relationships between events. For example, that a person was born before any
other events involving them. This is a directed graph. You could then check that the graph contains no cycles. See
this question on StackOverflow. This should be ok until time travel is invented. – Paul Harrison Jun 1, 2011 at 9:26

41 @paul-harrison If it were only that simple. In older records (even new ones) there are date inconsistencies.
Baptism before birth, multiple birth records etc... So to an extent, in official records, there is time travel. We allow
this inconsistent data. We allow users to indicate what the application should consider "the" birth record in case of
duplicates. And we'll indicate broken timelines if found. – Bert Goethals Jun 1, 2011 at 10:24

38 @ben-voigt GEDCOM is a format created by The Church of Jesus Christ of Latter-day Saints. The specification
clearly states that marriage (MARR) is to be between men and women. For same sex marriage or incest the ASSO
tag should be used (ASSOCIATES), also used to indicate friendship or being neighbours. It is clear the same sex
marriage is second class relation within this spec. A more neutral spec would not demand male female relations.
– Bert Goethals Jun 2, 2011 at 21:29

1 @Bert Goethals: You are confusing GEDCOM with certain programs that do not support same-sex marriage (PAF,
Legacy). GEDCOM does not preclude constructs such as "0 @F1@ FAM/1 HUSB @I1@/1 HUSB @I2@", and
thus supports same-sex marriages if your software chooses to. – Pierre Oct 17, 2014 at 11:20

1 @Pierre You can cheat the system indeed. This is directly from the 5.5.1 docs: "MARR {MARRIAGE}: = A legal,
common-law, or customary event of creating a family unit of a man and a woman as husband and wife."
(homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gcappa.htm) As you can see, no same sex marriage
here. – Bert Goethals Oct 29, 2014 at 11:45

1 @Bert Goethals: Here is a test which you can easily perform yourself: [1] Create a same-sex marriage in a Family
Tree Maker file; [2] export to GEDCOM; [3] Import the GEDCOM file into RootsMagic. The same-sex marriage is
preserved (vg. both partners are still male). How did that happen? Don't confuse what the spec. recommends,
versus what is actually possible. – Pierre Oct 30, 2014 at 13:41

@TylerH I did mean at least. Also any "relation" could be expressed as a multiple of pairs. Eg: a threesome can be
expressed as 2 relations per individual. – Bert Goethals Jun 11, 2015 at 8:16

@Pierre My post is about the Spec; not about what other applications try to achieve by breaking the spec. A spec
"specifies" not recommends. The fact that implementers are breaking the spec indicates that the spec is indeed
flawed. – Bert Goethals Jun 11, 2015 at 8:18
563 votes

Relax your assertions.

Not by changing the rules, which are most likely very helpful to 99.9% of your customers in catching
mistakes in entering their data.

Instead, change it from an error "can't add relationship" to a warning with an "add anyway".
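
One way to picture this, as a sketch only: classify each check as impossible (an error) or merely unlikely (a warning the user can override). `Severity`, `Finding`, `ParentLink`, `check` and `addLink` are hypothetical names, and the father-is-also-grandfather rule stands in for whatever checks the real program performs.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: names and rules are illustrative assumptions.
enum class Severity { Ok, Warning, Error };

struct Finding {
    Severity severity;
    std::string message;
};

struct ParentLink {
    int parent;
    int child;
};

// Classifies a proposed parent link. Only the impossible case is an Error;
// the merely unlikely case is a Warning that the user may override.
Finding check(const std::vector<ParentLink>& existing, const ParentLink& proposed) {
    if (proposed.parent == proposed.child) {
        return {Severity::Error, "a person cannot be their own parent"};
    }
    // Unlikely but possible: the new parent is already a grandparent of the child.
    for (const ParentLink& a : existing) {
        for (const ParentLink& b : existing) {
            if (a.parent == proposed.parent && a.child == b.parent &&
                b.child == proposed.child) {
                return {Severity::Warning, "parent is also a grandparent of this child"};
            }
        }
    }
    return {Severity::Ok, ""};
}

// Adds the link unless it is impossible, or unlikely and not confirmed by the user.
bool addLink(std::vector<ParentLink>& links, const ParentLink& proposed, bool addAnyway) {
    const Finding f = check(links, proposed);
    if (f.severity == Severity::Error) return false;
    if (f.severity == Severity::Warning && !addAnyway) return false;
    links.push_back(proposed);
    return true;
}
```

In a GUI, `addAnyway` would presumably come from the button the user clicks in the warning dialog.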

Share answered May 28, 2011 at 19:20


Ben Voigt
281k 44 426 728

143 When encountering a very unlikely situation, that is, one where a user would usually only do it by mistake, it is a
good idea to show the user a warning. That's good feedback. But then let the user go ahead if they are really sure
they want to. So I think this is a good answer, even if it doesn't get into the nuts and bolts of how. – thomasrutter
Jun 1, 2011 at 5:53

15 Good answer! I just wonder, how this kind of software will handle "I am my own grandpa" (youtube.com/watch?
v=eYlJH81dSiw) situation? – Zaur Nasibov Jun 1, 2011 at 8:44

4 This is not really an answer, because I think the problem comes from actually traversing the tree? However, it is a
good suggestion. – bdwakefield Jun 1, 2011 at 11:36

3 @bdwakefield: The question was "How do I resolve these errors, without removing all data assertions?" I believe
I've answered that. – Ben Voigt Jun 1, 2011 at 13:22

2 @Ben It depends on what the assertions are for. If they prevent infinite loops or fatal errors from happening, then
you are effectively suggesting to remove the assertions. If they are just there to warn a user of a potential mistake,
then your answer is a good one. – rm999 Jun 1, 2011 at 18:48

@rm999: The question is pretty clear that the assertions are enforced after traversing the data structure.
– Ben Voigt Jun 1, 2011 at 19:10

1 @Ben I think we are reading the question differently. I interpret "cycle" to mean a cycle in the graph (which would
mean during traversal). This makes sense in the context of the question: OP assumes a tree structure, and a child
having the same father and grandfather is a loop and hence not a tree. Many tree traversal algorithms would fail
on a graph with loops, so if my reading of the question is correct bdwakefield has an entirely legitimate point.
– rm999 Jun 1, 2011 at 19:49

224 votes

Here's the problem with family trees: they are not trees. They are directed acyclic graphs or DAGs. If I
understand the principles of the biology of human reproduction correctly, there will not be any cycles.

As far as I know, even the Christians accept marriages (and thus children) between cousins, which will
turn the family tree into a family DAG.

The moral of the story is: choose the right data structures.
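
As a rough illustration of the DAG idea, the sketch below stores only "parent of" links and refuses a link only when it would make someone their own ancestor. The class name `FamilyDag` and the use of integer IDs are assumptions made for the example.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical sketch of a "family DAG": each person maps to a list of parents.
class FamilyDag {
public:
    // Adds "parent is a parent of child" unless that would make someone
    // their own ancestor, which is the one thing biology rules out.
    bool addParent(int child, int parent) {
        if (child == parent || isAncestor(child, parent)) return false;
        parents_[child].push_back(parent);
        return true;
    }

    // True if `candidate` is reachable from `person` by following parent links.
    bool isAncestor(int candidate, int person) const {
        std::unordered_set<int> seen;
        std::vector<int> stack{person};
        while (!stack.empty()) {
            const int current = stack.back();
            stack.pop_back();
            if (!seen.insert(current).second) continue;  // already reached via another line
            const auto it = parents_.find(current);
            if (it == parents_.end()) continue;
            for (int p : it->second) {
                if (p == candidate) return true;
                stack.push_back(p);
            }
        }
        return false;
    }

private:
    std::unordered_map<int, std::vector<int>> parents_;
};
```

In this sketch the questioner's case (a man who is both father and grandfather of the same child) is accepted, since it adds a second path between two people but no cycle.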

Share edited Jun 1, 2011 at 14:07 answered Jun 1, 2011 at 9:58


Michael Hackner exDM69
8,635 2 28 29 2,495 2 15 10

7 It would need a further restriction of every node having 1 or 2 maximum nodes pointing to it for in vitro and sexual
reproduction. Although to be more true to real life, you might allow multiple dashed lines for uncertain descendancy
on the father side (it's always clear who the mother is, but only DNA testing can ensure who the father is, and that's
rarely done even today), or even for both if adoption is taken into account. – manixrock Jun 1, 2011 at 11:16

7 @manixrock - since this question is about rare cases, i would like to assert that is not always clear who the mother
is. adoptions, abandoned babies, surrogate moms, etc can all complicate matters. – Peter Recore Jun 1, 2011 at
14:26

9 It's not necessarily acyclic, is it? Man-marries-grandmother. – Ed Ropple Jun 3, 2011 at 1:19

13 A man marrying his own grandmother will not make himself his own grandfather or add a cycle. If they have
children, it will be a non-cycling regular graph edge. – exDM69 Jun 7, 2011 at 9:42

11 It's actually TWO DAGs. There is the parentage graph and the legal relationship graph. Usually the same, but
divergent more than one might expect. – JSacksteder Oct 4, 2011 at 20:38

1 Generally speaking there -will- be cycles. Just not cycles of the "parent of" type of relationship. There can very well
be cycles in relationships generally, such as "married to", especially when you consider that these relationships
change over time. – Agrajag Mar 13, 2012 at 13:43

2 Assuming perfect knowledge and biology, there won't be any cycles in biological ancestry, but in the real world
genealogy is full of "perhaps"-links. manixrock above, for example, asserts that it's always clear who the mother is.
Which is not true in the real world. Yes, today DNA-testing can answer that question, but how are you going to get
your relative from 5 generations back DNA-tested ? – Agrajag Apr 13, 2012 at 7:21

115 votes

I guess that you have some value that uniquely identifies a person on which you can base your checks.

This is a tricky one. Assuming you want to keep the structure a tree, I suggest this:

Assume this: A has kids with his own daughter.

A adds himself to the program as A and as B: once in the role of father and once in the role of, let's call it, boyfriend.

Add a is_same_for_out() function which tells the output generating part of your program that all links
going to B internally should be going to A on presentation of data.

This will make some extra work for the user, but I guess it would be relatively easy to implement and
maintain.

Building from that, you could work on code synching A and B to avoid inconsistencies.

This solution is surely not perfect, but is a first approach.
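
A sketch of how the `is_same_for_out()` idea might look, under the assumption that people are identified by string IDs. `AliasTable`, `addAlias` and `resolve` are hypothetical helpers; only the name `is_same_for_out` comes from the answer.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical sketch: `aliasOf_` records that one internal node stands in
// for another when data is presented.
class AliasTable {
public:
    // Declares that `proxy` (B) is the same real person as `original` (A).
    // Refuses links that would re-point an existing proxy or create a loop.
    bool addAlias(const std::string& proxy, const std::string& original) {
        if (aliasOf_.count(proxy) != 0 || resolve(original) == proxy) return false;
        aliasOf_[proxy] = original;
        return true;
    }

    // Should links to `a` and `b` be presented as links to one person?
    bool is_same_for_out(const std::string& a, const std::string& b) const {
        return resolve(a) == resolve(b);
    }

    // Follows alias links until it reaches a node that is not a proxy.
    std::string resolve(std::string id) const {
        for (auto it = aliasOf_.find(id); it != aliasOf_.end(); it = aliasOf_.find(id)) {
            id = it->second;
        }
        return id;
    }

private:
    std::unordered_map<std::string, std::string> aliasOf_;
};
```

Because the proxy holds no data of its own here, there is nothing to keep in sync: reads and writes would go to whatever `resolve()` returns.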

Share edited Dec 27, 2011 at 19:48 answered May 28, 2011 at 18:50
Peter Mortensen Eduard Thamm
31k 22 109 132 942 1 6 7

9 Probably such "proxy" nodes are indeed suitable solution. However I have no idea how can those be put in user
interface without offending user. I can tell you that writing software that deals with real people (especially your
customers) is not easy. – Partick Höse May 28, 2011 at 19:14

6 It never ends - B's new son will be his own uncle. I would consider a full refund for the program! – Bo Persson May
28, 2011 at 19:33

3 @Will A: And then realizes he is also his own mother, and recruits his younger self into the time agency? – Null Set
May 28, 2011 at 19:48

2 Duplication (and syncing) of data within one system is bad practice. It indicates that the solution is sub optimal and
should be reconsidered. If creating extra (duplicate) nodes would be needed, indicate it as a proxy and delegate the
data reads and writes to the original node. – Bert Goethals Jun 1, 2011 at 10:44

84 votes

You should focus on what really creates value for your software. Is the time spent on making it work for
ONE consumer worth the price of the license? Likely not.

I advise you to apologize to this customer, tell him that his situation is out of scope for your software and
issue him a refund.

Share edited Jun 1, 2011 at 9:55 answered Jun 1, 2011 at 8:51


christopheml
2,455 17 25

3 Very true. But also weigh other potential problems with similar troubles others have brought up. – Prof. Falken Jun 1,
2011 at 9:56

2 Of course. The reasoning is: if it's a rare edge case on a non-critical application, you are not required to fix or
implement anything. If it's really hurting your users, there's value in working on it. – christopheml Jun 1, 2011 at 11:24

10 Probably everybody has some case of incest somewhere in his/her ancestry. So you'll hit that bump if one digs family
history (too) deep. – datenwolf Jul 13, 2011 at 14:32

1 Making genealogy tree of some weird situation (inbreed royalty, Fritzl etc) is valid use of software. – Bulwersator Jun
7, 2013 at 6:03

1 Family tree software that won't allow second cousins to marry is useless. Nearly all families have at least one case of
this. Which is why I think the original example is made up for effect. – Fuzzy76 Jan 26, 2015 at 10:04

79 votes

You should have set up the Atreides family (either modern, Dune, or ancient, Oedipus Rex) as a testing
case. You don't find bugs by using sanitized data as a test case.

Share edited Dec 27, 2011 at 19:58 answered Jun 1, 2011 at 16:10
Peter Mortensen user779752
31k 22 109 132 791 4 2

2 Sadly, way too many people first think of 'ok' data instead of the edge cases that break their systems. – sjas Dec 24,
2012 at 11:14

59 votes

This is one of the reasons why languages like "Go" do not have assertions. They are used to handle
cases that you probably didn't think about, all too often. You should only assert the impossible, not
simply the unlikely. Doing the latter is what gives assertions a bad reputation. Every time you type
assert( , walk away for ten minutes and really think about it.

In your particularly disturbing case, it is both conceivable and appalling that such an assertion would be
bogus under rare but possible circumstances. Hence, handle it in your app, if only to say "This software
was not designed to handle the scenario that you presented".

Asserting that it is impossible for your great-great-great-grandfather to be your father is a reasonable thing to
do.

If I was working for a testing company that was hired to test your software, of course I would have
presented that scenario. Why? Every juvenile yet intelligent 'user' is going to do the exact same thing and
relish in the resulting 'bug report'.

Share edited Jun 2, 2011 at 7:47 answered Jun 1, 2011 at 6:10


user50049
5 Agree with 'when to use assertions' argument; don't see how it relates to 'some languages have asserts, Go doesn't.'
– phooji Jun 1, 2011 at 8:02

@Tim Post If you know it's impossible, then why bother asserting it? – Arlen Jun 1, 2011 at 8:46

2 @Red Hue - sometimes compilers make the impossible ... possible. Some versions of gcc think -10 == 10 in the
abs() implementation. – user50049 Jun 1, 2011 at 9:12

2 @Red Hue: The whole point of assertions is to document and test conditions that should always be true (or false). It
helps keep you (and others) from "fixing" things in such a way that those impossible cases arise, as then they'd
explicitly (rather than subtly) break the app. If there's a valid reason for an "impossible" case to appear, then you've
asserted too much. – cHao Jun 1, 2011 at 9:17

1 @cHao @Tim Post I'm just trying to understand why Go not having assertions is a good thing since most of you
agree that assertion is important to have. – Arlen Jun 1, 2011 at 9:24

1 @Red Hue: I imagine that point of view is based on cases like the original question, where assertions are being a bit
abused. Asserting too much (or too little) is more common than many people will realize, because people rarely learn
to use assertions properly. It's also a dev-time thing, which often won't be tested in production anyway, so some
would conclude that the best way to handle it is to get rid of assertions altogether. I disagree, but have some
sympathy for the reasoning behind it. – cHao Jun 1, 2011 at 9:35

1 Whether or not a language has assertions built in will hardly have much impact on whether developers use an assertion
pattern in their code. Many, I am sure, even do it without knowing that it's called an assertion... – Prof. Falken Jun 1, 2011
at 12:37

5 Having assertions (or assertion-like code) is irrelevant. Code in languages like Go can and will make assumptions
about the structure of data; it just can't document and enforce those assumptions with assertions. Bottom line: the
application has a bug. – Tommy McGuire Jun 1, 2011 at 20:59

This question has nothing to do with language features. – gvd Mar 4, 2012 at 18:16

41 votes

I hate commenting on such a screwed up situation, but the easiest way to not rejigger all of your invariants
is to create a phantom vertex in your graph that acts as a proxy back to the incestuous dad.

Share answered May 28, 2011 at 18:55


Sean
10k 4 40 43

37 votes

So, I've done some work on family tree software. I think the problem you're trying to solve is that you need
to be able to walk the tree without getting in infinite loops - in other words, the tree needs to be acyclic.

However, it looks like you're asserting that there is only one path between a person and one of their
ancestors. That will guarantee that there are no cycles, but is too strict. Biologically speaking,
descendancy is a directed acyclic graph (DAG). The case you have is certainly a degenerate case, but
that type of thing happens all the time on larger trees.

For example, if you look at the 2^n ancestors you have at generation n, if there was no overlap, then you'd
have more ancestors in 1000 AD than there were people alive. So, there's got to be overlap.

However, you also do tend to get cycles that are invalid, just bad data. If you're traversing the tree, then
cycles must be dealt with. You can do this in each individual algorithm, or on load. I did it on load.

Finding true cycles in a tree can be done in a few ways. The wrong way is to mark every ancestor from a
given individual, and when traversing, if the person you're going to step to next is already marked, then cut
the link. This will sever potentially accurate relationships. The correct way to do it is to start from each
individual, and mark each ancestor with the path to that individual. If the new path contains the current
path as a subpath, then it's a cycle, and should be broken. You can store paths as vector<bool> (MFMF,
MFFFMF, etc.) which makes the comparison and storage very fast.

There are a few other ways to detect cycles, such as sending out two iterators and seeing if they ever
collide with the subset test, but I ended up using the local storage method.

Also note that you don't need to actually sever the link, you can just change it from a normal link to a
'weak' link, which isn't followed by some of your algorithms. You will also want to take care when choosing
which link to mark as weak; sometimes you can figure out where the cycle should be broken by looking at
birthdate information, but often you can't figure out anything because so much data is missing.
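
The path-marking approach could look roughly like the sketch below. It reads "contains the current path as a subpath" as "an earlier path to the same person is a proper prefix of the new one", which is an interpretation rather than the answer's exact code. `Person`, `walk` and `weakenCycles` are hypothetical names, and the link that gets weakened is simply the one that closes the cycle, not one chosen from birthdate information.

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

struct Person {
    int father = -1;          // index into the people vector, -1 if unknown
    int mother = -1;
    bool fatherWeak = false;  // weak links are skipped by traversals
    bool motherWeak = false;
};

using Path = std::vector<bool>;  // true = step to father, false = step to mother

// True if `shorter` is a proper prefix of `longer`.
bool isProperPrefix(const Path& shorter, const Path& longer) {
    return shorter.size() < longer.size() &&
           std::equal(shorter.begin(), shorter.end(), longer.begin());
}

// Walks up from `person`; returns true if arriving here closed a cycle.
bool walk(std::vector<Person>& people, int person, Path& path,
          std::unordered_map<int, std::vector<Path>>& marks) {
    for (const Path& earlier : marks[person]) {
        if (isProperPrefix(earlier, path)) return true;  // person is their own ancestor
    }
    marks[person].push_back(path);
    Person& p = people[person];
    if (p.father >= 0 && !p.fatherWeak) {
        path.push_back(true);
        if (walk(people, p.father, path, marks)) p.fatherWeak = true;
        path.pop_back();
    }
    if (p.mother >= 0 && !p.motherWeak) {
        path.push_back(false);
        if (walk(people, p.mother, path, marks)) p.motherWeak = true;
        path.pop_back();
    }
    return false;
}

// Runs the walk from every individual and returns how many links are weak afterwards.
int weakenCycles(std::vector<Person>& people) {
    for (int start = 0; start < static_cast<int>(people.size()); ++start) {
        std::unordered_map<int, std::vector<Path>> marks;
        Path path;
        walk(people, start, path, marks);
    }
    int weakened = 0;
    for (const Person& p : people) weakened += p.fatherWeak + p.motherWeak;
    return weakened;
}
```

Reaching the same ancestor along two unrelated paths (father and grandfather being the same man, say) leaves both links intact; only a path that loops back onto itself is treated as bad data.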

Share edited May 26, 2014 at 12:59 answered Jun 1, 2011 at 18:39
tfinniga
6,743 3 33 39

Careful about those assumptions; one male and one female parent isn't a given when people adopt, or for lesbians who
consider themselves parents; in the near future they may even be able to really be the biological parents, at least
of girls. For that matter, if we apply Dolly to humans, even the assumption "a person has two distinct parents" is out.
– Agrajag Mar 13, 2012 at 13:47

1 @Agrajag, yes that's why I specified "biologically speaking" for the cycle detection. Even biologically, there are lots of
possible issues, like surrogate mothers and artificial insemination. If you also allow adoptions and other non-
biological methods for defining parents, then it's possible to have a valid true cycle in a tree - for example, maybe
someone adopts their grandparent when they get old and are no longer able to take care of themselves. Making
assumptions about people's family life is always complicated. But when writing software you need to make some
assumptions.. – tfinniga Apr 3, 2012 at 23:12

36 votes

Another mock serious answer for a silly question:

The real answer is, use an appropriate data structure. Human genealogy cannot fully be expressed using
a pure tree with no cycles. You should use some sort of graph. Also, talk to an anthropologist before going
any further with this, because there are plenty of other places similar errors could be made trying to model
genealogy, even in the most simple case of "Western patriarchal monogamous marriage."

Even if we want to ignore locally taboo relationships as discussed here, there are plenty of perfectly legal
and completely unexpected ways to introduce cycles into a family tree.

For example: http://en.wikipedia.org/wiki/Cousin_marriage

Basically, cousin marriage is not only common and expected, it is the reason humans have gone from
thousands of small family groups to a worldwide population of 6 billion. It can't work any other way.

There really are very few universals when it comes to genealogy, family and lineage. Almost any strict
assumption about norms suggesting who an aunt can be, or who can marry who, or how children are
legitimized for the purpose of inheritance, can be upset by some exception somewhere in the world or
history.

Share answered Jun 1, 2011 at 14:12


user437212

9 Your comment made me think of polygamy. Genealogy software that only models sexual reproduction may require a
name attached to the sperm and the egg but broader definitions of family structure do not. – Steve Kalemkiewicz Jun
2, 2011 at 20:12
Genealogy software will often allow more than one spouse in the model. How you display the model in the view
varies widely, even within one program, depending on the "mode" that has been provided. – Todd Hopkinson Mar 30,
2012 at 20:16

20 votes

Potential legal implications aside, it certainly seems that you need to treat a 'node' on a family tree as a
predecessor-person rather than assuming that the node can be the-one-and-only person.

Have the tree node include a person as well as the successors - and then you can have another node
deeper down the tree that includes the same person with different successors.

Share edited May 28, 2011 at 18:56 answered May 28, 2011 at 18:49
Will A
24.9k 5 50 61

13 votes

A few answers have shown ways to keep the assertions/invariants, but this seems like a misuse of
assertions/invariants. Assertions are to make sure something that should be true is true, and invariants are
to make sure something that shouldn't change doesn't change.

What you're asserting here is that incestuous relationships don't exist. Clearly they do exist, so your
assertion is invalid. You can work around this assertion, but the real bug is in the assertion itself. The
assertion should be removed.

Share answered Jun 1, 2011 at 19:55


kerkeslager
1,374 4 17 34

8 votes

Your family tree should use directed relations. This way you won't have a cycle.
Share edited Dec 27, 2011 at 19:52 answered Jun 1, 2011 at 12:22
Peter Mortensen Patrick Cornelissen
31k 22 109 132 8,018 6 49 73

5 votes

Genealogical data is cyclic and does not fit into an acyclic graph, so if you have assertions against cycles
you should remove them.

The way to handle this in a view without creating a custom view is to treat the cyclic parent as a "ghost"
parent. In other words, when a person is both a father and a grandfather to the same person, then the
grandfather node is shown normally, but the father node is rendered as a "ghost" node that has a simple
label like ("see grandfather") and points to the grandfather.

In order to do calculations you may need to improve your logic to handle cyclic graphs so that a node is
not visited more than once if there is a cycle.
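
For the calculation side, a visited set is usually enough to keep a traversal from looping. The sketch below collects ancestors breadth-first; `ParentMap` and `ancestors` are hypothetical names and the integer IDs are an assumption.

```cpp
#include <queue>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical sketch: maps a person ID to the IDs of their parents.
using ParentMap = std::unordered_map<int, std::vector<int>>;

// Collects every distinct ancestor of `start`, expanding each node at most once,
// so the walk terminates even if the data contains a cycle.
std::unordered_set<int> ancestors(const ParentMap& parentsOf, int start) {
    std::unordered_set<int> visited{start};
    std::unordered_set<int> result;
    std::queue<int> pending;
    pending.push(start);
    while (!pending.empty()) {
        const int current = pending.front();
        pending.pop();
        const auto it = parentsOf.find(current);
        if (it == parentsOf.end()) continue;
        for (int parent : it->second) {
            result.insert(parent);  // contains `start` only if the data loops back to it
            if (visited.insert(parent).second) pending.push(parent);
        }
    }
    return result;
}
```

A person who is both father and grandfather is counted once, and a cyclic record merely shows up as `start` appearing in its own result.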

Share answered Dec 12, 2012 at 19:08


Tyler Durden
11.3k 10 68 131

4 votes

The most important thing is to avoid creating a problem, so I believe that you should use a direct
relation to avoid having a cycle.

As @markmywords said, #include "fritzl.h".

Finally, I have to say: recheck your data structure. Maybe something is going wrong over there (maybe a
bidirectional linked list solves your problem).

Share edited Jul 25, 2011 at 10:26 answered Jun 6, 2011 at 6:12
Peter Mortensen Nasser Hadjloo
31k 22 109 132 12.4k 15 70 101

4 votes

Assertions don't survive reality

Usually assertions don't survive contact with real-world data. Deciding which data you want to deal with
and which data is out of scope is part of the software engineering process.

Cyclic family graphs


Regarding family "trees" (in fact they are full-blown graphs, including cycles), there is a nice anecdote:

I married a widow who had a grown daughter. My father, who often visited us, fell in love with my
step-daughter and married her. As a result, my father became my son, and my daughter became
my mother. Some time later, I gave my wife a son, who was the brother of my father, and my
uncle. My father's wife (who is also my daughter and my mother) got a son. As a result, I got a
brother and a grandson in the same person. My wife is now my grandmother, because she is my
mother's mother. So I am the husband of my wife, and at the same time the step-grandson of my
wife. In other words, I'm my own grandpa.

Things get even more strange, when you take surrogates or "fuzzy fatherhood" into account.

How to deal with that


Define cycles as out-of-scope
You could decide that your software should not deal with such rare cases. If such a case occurs, the user
should use a different product. This makes dealing with the more common cases much more robust,
because you can keep more assertions and a simpler data model.

In this case, add some good import and export features to your software, so the user can easily migrate to
a different product when necessary.

Allow manual relations


You could allow the user to add manual relations. These relations are not "first-class citizens", i.e. the
software takes them as-is, doesn't check them and doesn't handle them in the main data model.

The user can then handle rare cases by hand. Your data model will still stay quite simple and your
assertions will survive.
Be careful with manual relations. There is a temptation to make them completely configurable and hence
create a fully configurable data model. This will not work: Your software will not scale, you will get strange
bugs and finally the user interface will become unusable. This anti-pattern is called "soft coding", and "The
daily WTF" is full of examples for that.

Make your data model more flexible, skip assertions, test invariants
The last resort would be making your data model more flexible. You would have to skip nearly all
assertions and base your data model on a full blown graph. As the above example shows, it is easily
possible to be your own grandfather, so you can even have cycles.

In this case, you should extensively test your software. You will have skipped nearly all assertions, so there is a
good chance of additional bugs.

Use a test data generator to check unusual test cases. There are quick check libraries for Haskell, Erlang
or C. For Java / Scala there are ScalaCheck and Nyaya. One test idea would be to simulate a random
population, let it interbreed at random, then let your software first import and then export the result. The
expectation would be that all connections in the output are also in the input and vice versa.
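
A hand-rolled sketch of that round-trip test, without a QuickCheck-style library. `randomPopulation`, `exportGraph`, `importGraph` and `roundTripHolds` are hypothetical names, and the text format is a stand-in for the real export and import code under test.

```cpp
#include <random>
#include <set>
#include <sstream>
#include <string>
#include <utility>

using Edge = std::pair<int, int>;  // (parent, child)
using Graph = std::set<Edge>;

// Generates a random population in which any earlier-numbered person,
// however closely related, may be a parent of a later one.
Graph randomPopulation(int size, unsigned seed) {
    std::mt19937 rng(seed);
    Graph g;
    for (int child = 2; child < size; ++child) {
        std::uniform_int_distribution<int> pick(0, child - 1);
        g.emplace(pick(rng), child);
        g.emplace(pick(rng), child);  // may repeat the first pick; the set keeps one edge
    }
    return g;
}

// Stand-ins for the real export and import code under test.
std::string exportGraph(const Graph& g) {
    std::ostringstream out;
    for (const Edge& e : g) out << e.first << ' ' << e.second << '\n';
    return out.str();
}

Graph importGraph(const std::string& text) {
    std::istringstream in(text);
    Graph g;
    int parent = 0, child = 0;
    while (in >> parent >> child) g.emplace(parent, child);
    return g;
}

// The invariant: export followed by import preserves every connection.
bool roundTripHolds(int size, unsigned seed) {
    const Graph original = randomPopulation(size, seed);
    return importGraph(exportGraph(original)) == original;
}
```

Running `roundTripHolds` over many seeds and population sizes is the cheap version of what a property-testing library would automate, including shrinking a failing case to a small one.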

A property that stays the same is called an invariant. In this case, the invariant is the set of
"romantic relations" between the individuals in the simulated population. Try to find as many invariants as
possible and test them with randomly generated data. Invariants can be functional, e.g.:

an uncle stays an uncle, even when you add more "romantic relations"

every child has a parent


a population with two generations has at least one grand-parent

Or they can be technical:

Your software will not crash on a graph up to 10 billion members (no matter how many
interconnections)

Your software scales with O(number-of-nodes) and O(number-of-edges^2)


Your software can save and re-load every family graph up to 10 billion members

By running the simulated tests, you will find lots of strange corner cases. Fixing them will take a lot of time.
Also you will lose a lot of optimizations, your software will run much slower. You have to decide, if it is
worth it and if this is in the scope of your software.

Share edited Jan 26, 2015 at 14:48 answered Jan 26, 2015 at 14:17
stefan.schwetschke
8,912 1 27 30

3 votes

Instead of removing all assertions, you should still check for things like a person being his/her own parent
or other impossible situations and present an error. Maybe issue a warning if it is unlikely so the user can
still detect common input errors, but it will work if everything is correct.

I would store the data in a vector with a permanent integer for each person and store the parents and
children in person objects where the said int is the index of the vector. This would be pretty fast to go
between generations (but slow for things like name searches). The objects would be in order of when they
were created.
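
The storage layout described above might be sketched like this. `Person`, `Registry`, `add`, `link` and `at` are hypothetical names; the sketch assumes people are never erased, so an index stays valid for the lifetime of the registry.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Person {
    std::string name;
    std::vector<int> parents;   // indices into the registry's vector
    std::vector<int> children;  // indices into the registry's vector
};

class Registry {
public:
    // Appends a person and returns their permanent index (creation order).
    int add(std::string name) {
        people_.push_back({std::move(name), {}, {}});
        return static_cast<int>(people_.size()) - 1;
    }

    // Records the link on both sides so either direction is one lookup away.
    void link(int parent, int child) {
        assert(valid(parent) && valid(child));  // an out-of-range index is a programming error
        people_[child].parents.push_back(parent);
        people_[parent].children.push_back(child);
    }

    const Person& at(int id) const { return people_.at(id); }

private:
    bool valid(int id) const {
        return id >= 0 && id < static_cast<int>(people_.size());
    }
    std::vector<Person> people_;
};
```

The one `assert` here guards something that can only go wrong through a bug in the program, not through unusual family data, which matches the distinction drawn elsewhere in this thread.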
Share answered Dec 2, 2011 at 10:19
ctype.h
1,480 4 20 34

-3 votes

Duplicate the father (or use symlink/reference).

For example, if you are using a hierarchical database:

$ #each person node has two nodes representing its parents.


$ mkdir Family
$ mkdir Family/Son
$ mkdir Family/Son/Daughter
$ mkdir Family/Son/Father
$ mkdir Family/Son/Daughter/Father
$ ln -s Family/Son/Daughter/Father Family/Son/Father
$ mkdir Family/Son/Daughter/Wife
$ tree Family
Family
└── Son
    ├── Daughter
    │   ├── Father
    │   └── Wife
    └── Father -> Family/Son/Daughter/Father

4 directories, 1 file

Share answered Jan 13, 2012 at 4:39


numeric
475 4 15

3 The ln -s command doesn't work that way; the resolution of the link Family/Son/Father will look for
Family/Son/Daughter/Father from under Family/Son , where the link resides, not from . where you issued
the ln -s command. – musiphil Jan 13, 2012 at 23:08

48 cloning is prohibited by the geneva conventions – MikeIsrael Nov 8, 2012 at 9:46
